A New Oversampling Method Based on Triangulation of Sample Space

dc.authoridWANG, JIAN/0000-0002-4316-932X
dc.authoridZhang, Chao/0000-0002-0142-0280
dc.authorwosidWANG, JIAN/F-4224-2010
dc.contributor.authorChen, Yueqi
dc.contributor.authorPedrycz, Witold
dc.contributor.authorWang, Jian
dc.contributor.authorZhang, Chao
dc.contributor.authorYang, Jie
dc.date.accessioned2024-05-19T14:47:03Z
dc.date.available2024-05-19T14:47:03Z
dc.date.issued2024
dc.departmentİstinye Üniversitesien_US
dc.description.abstractCoping with imbalanced data is a challenging task in practical classification problems. One effective way to address class imbalance is to oversample the minority class. The synthetic minority oversampling technique (SMOTE) is a classical oversampling method. However, it has two disadvantages, namely, linear generation and overgeneralization. In this article, an improved SMOTE method, FE-SMOTE, is proposed based on the idea of the finite element method. FE-SMOTE not only overcomes these two disadvantages of SMOTE but also generates samples that are more in line with the density distribution of the original minority class than those generated by existing SMOTE variants. The originality of the proposed method stems from constructing a simplex for every minority sample and then triangulating it to expand the region of synthetic samples from lines to space. A new definition of the relative size of triangular elements not only helps determine the number of synthetic samples but also weakens the adverse impact of outliers. Samples generated by FE-SMOTE can effectively reflect the local potential distribution structure around every minority sample. Compared with 16 commonly studied oversampling methods, FE-SMOTE produces promising results in terms of G-mean, AUC, F-measure, and accuracy on 22 benchmark imbalanced datasets and the big dataset MNIST.en_US
dc.description.sponsorshipNational Key Research and Development Program of China [2018AAA0100300]; Fundamental Research Funds for the Central Universities [DUT22YG236]; National Natural Science Foundation of China [62176040, 62172073, 62076182]en_US
dc.description.sponsorshipThis work was supported in part by the National Key Research and Development Program of China under Grant 2018AAA0100300; in part by the Fundamental Research Funds for the Central Universities under Grant DUT22YG236; and in part by the National Natural Science Foundation of China under Grant 62176040, Grant 62172073, and Grant 62076182.en_US
dc.identifier.doi10.1109/TSMC.2023.3319694
dc.identifier.endpage786en_US
dc.identifier.issn2168-2216
dc.identifier.issn2168-2232
dc.identifier.issue2en_US
dc.identifier.scopus2-s2.0-85174849310en_US
dc.identifier.scopusqualityQ1en_US
dc.identifier.startpage774en_US
dc.identifier.urihttps://doi.org/10.1109/TSMC.2023.3319694
dc.identifier.urihttps://hdl.handle.net/20.500.12713/5643
dc.identifier.volume54en_US
dc.identifier.wosWOS:001091026900001en_US
dc.identifier.wosqualityN/Aen_US
dc.indekslendigikaynakWeb of Scienceen_US
dc.indekslendigikaynakScopusen_US
dc.language.isoenen_US
dc.publisherIEEE-Inst Electrical Electronics Engineers Incen_US
dc.relation.ispartofIEEE Transactions on Systems Man Cybernetics-Systemsen_US
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Faculty Memberen_US
dc.rightsinfo:eu-repo/semantics/closedAccessen_US
dc.snmz20240519_kaen_US
dc.subjectTopologyen_US
dc.subjectInterpolationen_US
dc.subjectTrainingen_US
dc.subjectNeural Networksen_US
dc.subjectCostsen_US
dc.subjectSolidsen_US
dc.subjectReliabilityen_US
dc.subjectFinite Element Methoden_US
dc.subjectImbalanced Learningen_US
dc.subjectOversamplingen_US
dc.subjectSimplexen_US
dc.subjectTriangulationen_US
dc.titleA New Oversampling Method Based on Triangulation of Sample Spaceen_US
dc.typeArticleen_US
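
The abstract contrasts SMOTE's generation along lines with generation inside a simplex built around each minority sample. The sketch below illustrates only that geometric contrast and is not the authors' FE-SMOTE algorithm: it omits the triangulation of the simplex and the relative-size rule for allocating synthetic samples. The function names, the choice of the d nearest neighbors as simplex vertices, and the uniform Dirichlet weights are all assumptions made for illustration.

```python
import numpy as np

def nearest_neighbors(X, i, k):
    """Indices of the k nearest neighbors of X[i] (Euclidean), excluding i itself."""
    dist = np.linalg.norm(X - X[i], axis=1)
    dist[i] = np.inf  # never pick the point itself
    return np.argsort(dist)[:k]

def smote_point(X, i, k, rng):
    """Classic SMOTE step: a point on the segment from X[i] to a random neighbor."""
    j = rng.choice(nearest_neighbors(X, i, k))
    lam = rng.random()  # interpolation factor in [0, 1)
    return X[i] + lam * (X[j] - X[i])

def simplex_point(X, i, rng):
    """Hypothetical simplex variant: a point inside the simplex spanned by
    X[i] and its d nearest neighbors (assumed vertex choice, not from the paper)."""
    d = X.shape[1]
    verts = np.vstack([X[i], X[nearest_neighbors(X, i, d)]])  # shape (d + 1, d)
    w = rng.dirichlet(np.ones(d + 1))  # barycentric weights, uniform over the simplex
    return w @ verts

rng = np.random.default_rng(0)
X_min = rng.normal(size=(20, 2))  # toy minority class in two dimensions
line_sample = smote_point(X_min, 0, k=5, rng=rng)
area_sample = simplex_point(X_min, 0, rng=rng)
```

In two dimensions the simplex is a triangle, so `simplex_point` can return any point of that triangle, whereas `smote_point` is confined to a segment joining two minority samples.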
