DSpace Repository

Improving internet of vehicles research: A systematic preprocessing framework for the VeReMi dataset

dc.rights.license CC BY eng
dc.contributor.author Roy, A. cze
dc.contributor.author Bhattacharjee, Debotosh cze
dc.contributor.author Krejcar, Ondřej cze
dc.date.accessioned 2025-12-05T15:41:07Z
dc.date.available 2025-12-05T15:41:07Z
dc.date.issued 2025 eng
dc.identifier.issn 2352-3409 eng
dc.identifier.uri http://hdl.handle.net/20.500.12603/2383
dc.description.abstract The Vehicular Reference Misbehavior Dataset (VeReMi) is a vital resource for advancing Intelligent Transportation Systems (ITS) and the Internet of Vehicles (IoV). However, its large size (∼7 GB) and inherent class imbalance pose significant challenges for machine learning model development. This paper presents a preprocessing framework to enhance VeReMi's usability and relevance. Through 10% down-sampling, the dataset was reduced to ∼724 MB, making it computationally manageable. Biases were addressed by balancing benign and malicious samples through synthesis and by identifying benign instances using predefined criteria. A refined feature set, including key attributes such as rcvTime, pos_0, pos_1, and attack_type (renamed attacker_type), was selected to improve machine learning compatibility. This preprocessing pipeline effectively maintains data integrity and preserves the representativeness of malicious patterns. The optimized dataset is well-suited for ITS and IoV applications, such as anomaly detection and network security, underscoring the crucial role of preprocessing in overcoming real-world constraints and enhancing model performance. © 2025 The Authors eng
dc.format p. "Article number: 111599" eng
dc.language.iso eng eng
dc.publisher Elsevier Inc. eng
dc.relation.ispartof Data in Brief, volume 60, issue: June eng
dc.subject Anomaly detection eng
dc.subject Cybersecurity eng
dc.subject Data preprocessing eng
dc.subject Dataset optimization eng
dc.subject Intelligent transportation systems eng
dc.subject Internet of vehicles eng
dc.subject Intrusion detection systems eng
dc.subject Machine learning eng
dc.subject Network security eng
dc.subject Vehicular reference misbehavior dataset eng
dc.title Improving internet of vehicles research: A systematic preprocessing framework for the VeReMi dataset eng
dc.type article eng
dc.identifier.obd 43882000 eng
dc.identifier.doi 10.1016/j.dib.2025.111599 eng
dc.publicationstatus postprint eng
dc.peerreviewed yes eng
dc.source.url https://www.sciencedirect.com/science/article/pii/S2352340925003312?pes=vor&utm_source=scopus&getft_integrator=scopus cze
dc.relation.publisherversion https://www.sciencedirect.com/science/article/pii/S2352340925003312?pes=vor&utm_source=scopus&getft_integrator=scopus eng
dc.rights.access Open Access eng
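
The abstract describes three preprocessing steps: 10% down-sampling, feature selection with attack_type renamed to attacker_type, and balancing of benign and malicious samples. The following is a minimal sketch of that flow, not the authors' implementation. It rests on several assumptions that the record does not confirm: a row is benign when its attacker label equals 0, balancing is done by naive random duplication of the minority class (a stand-in for the paper's unspecified "synthesis"), and only the four attributes named in the abstract are kept. All function names are hypothetical.

```python
import random

# Only the attributes named in the abstract; the paper's full feature set may be larger.
FEATURES = ["rcvTime", "pos_0", "pos_1", "attacker_type"]


def downsample(rows, fraction=0.10, seed=0):
    """Draw a uniform random sample of the given fraction, without replacement."""
    if not rows:
        return []
    rng = random.Random(seed)
    k = max(1, round(len(rows) * fraction))
    return rng.sample(rows, k)


def select_features(row):
    """Rename attack_type to attacker_type and keep only the FEATURES columns."""
    renamed = dict(row)
    if "attack_type" in renamed:
        renamed["attacker_type"] = renamed.pop("attack_type")
    return {name: renamed[name] for name in FEATURES}


def balance(rows, seed=0):
    """Equalize class counts by duplicating random minority-class rows.

    Assumption: attacker_type == 0 marks a benign row. Random duplication
    is a simple substitute for the paper's synthesis step.
    """
    rng = random.Random(seed)
    benign = [r for r in rows if r["attacker_type"] == 0]
    malicious = [r for r in rows if r["attacker_type"] != 0]
    if not benign or not malicious:
        return list(rows)  # nothing to balance against
    minority, majority = sorted([benign, malicious], key=len)
    shortfall = len(majority) - len(minority)
    extra = [dict(rng.choice(minority)) for _ in range(shortfall)]
    return majority + minority + extra


def preprocess(rows, fraction=0.10, seed=0):
    """Down-sample, select features, then balance the two classes."""
    sampled = downsample(rows, fraction, seed)
    return balance([select_features(r) for r in sampled], seed)
```

Each row is assumed to be a dictionary keyed by column name, for example as produced by csv.DictReader after numeric conversion. Sampling before balancing keeps the working set small, at the cost of a balanced output that contains duplicated minority rows.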

