| dc.rights.license | CC BY | eng |
| dc.contributor.author | Čech, Pavel | cze |
| dc.contributor.author | Ponce, Daniela | cze |
| dc.contributor.author | Mikulecký, Peter | cze |
| dc.contributor.author | Žváčková, Andrea | cze |
| dc.contributor.author | Mls, Karel | cze |
| dc.contributor.author | Otčenášková, Tereza | cze |
| dc.contributor.author | Tučník, Petr | cze |
| dc.date.accessioned | 2025-12-05T15:38:03Z | |
| dc.date.available | 2025-12-05T15:38:03Z | |
| dc.date.issued | 2025 | eng |
| dc.identifier.issn | 2662-995X | eng |
| dc.identifier.uri | http://hdl.handle.net/20.500.12603/2362 | |
| dc.description.abstract | This study examines the effect of synthetic data generation for balancing class distributions on the performance of classification algorithms in smart city network systems. Contrary to the assumption that data balancing improves classification performance, the analysis reveals a more complex impact. Using three publicly available network traffic benchmark datasets and four different balancing techniques, the study evaluates the performance of five classifiers on 65 classification tasks. The findings indicate that, for smaller datasets, classifiers that achieved the highest accuracy on unbalanced data did not benefit from synthetic data generation for minority classes. Although neural network-based classifiers showed improved performance with balanced data, these improvements came at the cost of lower overall classification scores. For larger datasets, balancing through random oversampling of minority classes and undersampling of majority classes helped improve classification. However, these improvements were limited to precision, with no significant gains in recall. The study offers valuable insights into using synthetic data for intrusion detection, emphasizing the challenges of intricate dependencies in network traffic data for generative models. The results align with previous research showing mixed effects of data balancing on classifier performance, contributing to a broader understanding of the limited efficacy of synthetic data in real-world network contexts. This experimental study highlights the need for a systematic benchmarking framework for synthetic data research, ensuring consistency in data balancing and classification processes. This work contributes to the ongoing discourse on the intersection of machine learning and cybersecurity, emphasizing the critical role of data in developing resilient intrusion detection systems. © The Author(s) 2025. | eng |
| dc.format | p. "Article number: 174" | eng |
| dc.language.iso | eng | eng |
| dc.publisher | Springer | eng |
| dc.relation.ispartof | SN Computer Science, volume 6, issue: 2 | eng |
| dc.subject | Attack classification | eng |
| dc.subject | Generative adversarial networks | eng |
| dc.subject | Imbalanced datasets | eng |
| dc.subject | Intrusion detection | eng |
| dc.title | The Effect of Generating Synthetic Data in Smart City Network Systems | eng |
| dc.type | article | eng |
| dc.identifier.obd | 43881928 | eng |
| dc.identifier.doi | 10.1007/s42979-025-03673-3 | eng |
| dc.publicationstatus | postprint | eng |
| dc.peerreviewed | yes | eng |
| dc.source.url | https://link.springer.com/article/10.1007/s42979-025-03673-3 | cze |
| dc.relation.publisherversion | https://link.springer.com/article/10.1007/s42979-025-03673-3 | eng |
| dc.rights.access | Open Access | eng |
| dc.project.ID | VJ02010016/Využití umělé inteligence pro zajištění kybernetické bezpečnosti Smart City | eng |