| dc.rights.license | CC BY | eng |
| dc.contributor.author | Roy, A. | cze |
| dc.contributor.author | Bhattacharjee, Debotosh | cze |
| dc.contributor.author | Krejcar, Ondřej | cze |
| dc.date.accessioned | 2025-12-05T15:41:07Z | |
| dc.date.available | 2025-12-05T15:41:07Z | |
| dc.date.issued | 2025 | eng |
| dc.identifier.issn | 2352-3409 | eng |
| dc.identifier.uri | http://hdl.handle.net/20.500.12603/2383 | |
| dc.description.abstract | The Vehicular Reference Misbehavior Dataset (VeReMi) is a vital resource for advancing Intelligent Transportation Systems (ITS) and the Internet of Vehicles (IoV). However, its large size (∼7 GB) and inherent class imbalance pose significant challenges for machine learning model development. This paper presents a preprocessing framework to enhance VeReMi's usability and relevance. Through 10% down-sampling, the dataset was reduced to ∼724 MB, making it computationally manageable. Biases were addressed by balancing benign and malicious samples through synthesis and identifying benign instances using predefined criteria. A refined feature set, including key attributes like rcvTime, pos_0, pos_1, and attack_type (renamed attacker_type), was selected to improve machine learning compatibility. This preprocessing pipeline effectively maintains data integrity and preserves the representativeness of malicious patterns. The optimized dataset is well-suited for ITS and IoV applications, such as anomaly detection and network security, underscoring the crucial role of preprocessing in overcoming real-world constraints and enhancing model performance. © 2025 The Authors | eng |
| dc.format | p. "Article number: 111599" | eng |
| dc.language.iso | eng | eng |
| dc.publisher | Elsevier Inc. | eng |
| dc.relation.ispartof | Data in Brief, volume 60, issue: June | eng |
| dc.subject | Anomaly detection | eng |
| dc.subject | Cybersecurity | eng |
| dc.subject | Data preprocessing | eng |
| dc.subject | Dataset optimization | eng |
| dc.subject | Intelligent transportation systems | eng |
| dc.subject | Internet of vehicles | eng |
| dc.subject | Intrusion detection systems | eng |
| dc.subject | Machine learning | eng |
| dc.subject | Network security | eng |
| dc.subject | Vehicular reference misbehavior dataset | eng |
| dc.title | Improving internet of vehicles research: A systematic preprocessing framework for the VeReMi dataset | eng |
| dc.type | article | eng |
| dc.identifier.obd | 43882000 | eng |
| dc.identifier.doi | 10.1016/j.dib.2025.111599 | eng |
| dc.publicationstatus | postprint | eng |
| dc.peerreviewed | yes | eng |
| dc.source.url | https://www.sciencedirect.com/science/article/pii/S2352340925003312?pes=vor&utm_source=scopus&getft_integrator=scopus | cze |
| dc.relation.publisherversion | https://www.sciencedirect.com/science/article/pii/S2352340925003312?pes=vor&utm_source=scopus&getft_integrator=scopus | eng |
| dc.rights.access | Open Access | eng |
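
The preprocessing steps named in the abstract (10% down-sampling, feature selection, renaming attack_type to attacker_type, and class balancing) might be sketched roughly as follows. This is not the authors' code. The function name `preprocess`, the rule that a label of 0 marks a benign message, and the use of random duplication in place of the paper's synthesis step are all assumptions made for illustration; only the four column names come from the abstract.

```python
import random

# Only these four attributes are named in the abstract; the real feature set may be larger.
FEATURES = ["rcvTime", "pos_0", "pos_1", "attack_type"]


def preprocess(rows, fraction=0.10, seed=0):
    """Down-sample, select features, rename the label column, and balance classes.

    `rows` is a list of dicts, one per message record (hypothetical input shape).
    """
    rng = random.Random(seed)

    # 1. Simple random down-sampling to roughly `fraction` of the records.
    k = max(1, round(len(rows) * fraction))
    sample = rng.sample(rows, k)

    # 2. Keep the selected features and rename attack_type -> attacker_type.
    slim = []
    for r in sample:
        out = {name: r[name] for name in FEATURES if name != "attack_type"}
        out["attacker_type"] = r["attack_type"]
        slim.append(out)

    # 3. Assumption: attacker_type == 0 marks a benign message.
    benign = [r for r in slim if r["attacker_type"] == 0]
    malicious = [r for r in slim if r["attacker_type"] != 0]

    # 4. Balance by duplicating random rows of the smaller class.
    #    This is a stand-in for the synthesis step, not the paper's method.
    small, large = sorted([benign, malicious], key=len)
    if small:
        extra = [dict(rng.choice(small)) for _ in range(len(large) - len(small))]
        small = small + extra
    return small + large
```

Fixing the seed keeps the down-sampled subset reproducible, which matters when the reduced dataset is shared or compared across experiments.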