Xiong, Sheng Hua; Wei, Xi Hang; Chen, Zhen Song; Zhang, Hao; Pedrycz, Witold; Skibniewski, Mirosław J.
2025-04-18
2025-04-18
2025
Xiong, S. H., Wei, X. H., Chen, Z. S., Zhang, H., Pedrycz, W., & Skibniewski, M. J. (2025). Identifying causes of aviation safety events using wW2V-tCNN with data augmentation. International Journal of General Systems, 1-30.
03081079
http://dx.doi.org/10.1080/03081079.2025.2456960
https://hdl.handle.net/20.500.12713/7111
Identifying the causes of civil aviation safety events is crucial for safety agencies to create recommendations and for airlines to enhance procedures and mitigate hazards. This paper proposes a model to identify the causes of civil aviation safety events using a weighted Word2Vec-based Text-CNN (wW2V-tCNN) algorithm and data augmentation techniques. A corpus is built by matching narrative texts from investigation reports with cause labels from the Aviation Safety Network database. This corpus is transformed into Text-CNN inputs using a weighted sentence vector method based on word embeddings, considering word frequency and part-of-speech weighting. Additionally, a novel document balancing method is introduced for data augmentation. The proposed identification model achieves Macro-F1 and Macro-accuracy scores of 0.9803 and 0.9699, outperforming traditional methods and showing significant improvement over models like Doc2vec and SBERT. This model provides an accurate tool for safety agencies and airlines to analyze and effectively mitigate civil aviation safety events. © 2025 Informa UK Limited, trading as Taylor & Francis Group.
en
info:eu-repo/semantics/closedAccess
Aviation Safety; Data Augmentation; Text-CNN; Word Weight; Word2Vec
Identifying causes of aviation safety events using wW2V-tCNN with data augmentation
Article
WOS:001409573700001
2-s2.0-85216466996
10.1080/03081079.2025.2456960
Q2
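
The abstract mentions a weighted sentence vector built from word embeddings using word frequency and part-of-speech weighting, but does not give the formula. The following is only a minimal sketch of one way such a weighting could look, not the authors' method: the function name, the part-of-speech weight values, and the frequency weight of the form a / (a + p(word)) are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical part-of-speech weights; the abstract does not state the actual values.
POS_WEIGHTS = {"NOUN": 1.0, "VERB": 0.8, "ADJ": 0.5}
DEFAULT_POS_WEIGHT = 0.2


def weighted_sentence_vector(tagged_tokens, embeddings, word_counts, a=1e-3):
    """Weighted average of word vectors for one sentence.

    tagged_tokens: list of (word, pos_tag) pairs.
    embeddings: dict mapping word -> list of floats (e.g. exported from Word2Vec).
    word_counts: Counter of corpus word frequencies.
    a: smoothing constant in the assumed frequency weight a / (a + p(word)).
    """
    total = sum(word_counts.values()) or 1
    dim = len(next(iter(embeddings.values())))
    acc = [0.0] * dim
    weight_sum = 0.0
    for word, pos in tagged_tokens:
        if word not in embeddings:
            continue  # skip out-of-vocabulary words
        p = word_counts[word] / total  # relative corpus frequency
        # Rare words and content-bearing parts of speech get larger weights.
        w = (a / (a + p)) * POS_WEIGHTS.get(pos, DEFAULT_POS_WEIGHT)
        acc = [x + w * v for x, v in zip(acc, embeddings[word])]
        weight_sum += w
    if weight_sum == 0.0:
        return acc  # no known words: return the zero vector
    return [x / weight_sum for x in acc]


# Toy usage with made-up two-dimensional embeddings.
toy_embeddings = {"engine": [1.0, 0.0], "the": [0.0, 1.0]}
toy_counts = Counter({"engine": 1, "the": 99})
toy_vector = weighted_sentence_vector(
    [("engine", "NOUN"), ("the", "DET")], toy_embeddings, toy_counts
)
```

In this toy case the rare noun dominates the sentence vector while the frequent determiner contributes very little, which is the general effect a frequency and part-of-speech weighting is meant to have before the vectors are passed to a classifier such as a Text-CNN.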