DSpace Repository :: Browsing by Author "Chen, Yueqi"

Browse by Author "Chen, Yueqi"

Showing 1 - 3 of 3

Distribution enhancement for imbalanced data with generative adversarial network
(Wiley, 2024) Chen, Yueqi; Pedrycz, Witold; Pan, Tingting; Wang, Jian; Yang, Jie
Handling class imbalance in real-world applications remains a challenge. Oversampling is a widely used approach for imbalanced tabular data. However, most traditional oversampling methods generate samples by interpolating within the minority (positive) class and therefore fail to fully capture the probability density distribution of the original data. This paper presents GAN-E, a novel oversampling method based on a generative adversarial network (GAN) that introduces three strategies to enhance the distribution of the positive class. The first strategy injects prior knowledge of the positive class into the latent space of the GAN, improving sample emulation. The second injects random noise containing this prior knowledge into both original and generated positive samples to stretch the learning space of the GAN's discriminator. The third uses multiple GANs to learn comprehensive probability distributions of the positive class from multi-scale data, to eliminate the tendency of GANs to generate aggregated samples. Experimental results and statistical tests on 18 commonly used imbalanced datasets show that the proposed method outperforms 14 other rebalancing methods in terms of G-mean, F-measure, AUC and accuracy.
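The abstract reports G-mean, F-measure and accuracy without defining them. The sketch below uses the standard textbook definitions to illustrate why accuracy alone can mislead on imbalanced data. The function name `imbalance_metrics` and the toy labels are illustrative assumptions and are not taken from the paper.

```python
import math

def imbalance_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, G-mean and F-measure for a binary problem.

    Standard definitions only; this is not code from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    tpr = tp / (tp + fn) if (tp + fn) else 0.0        # recall on the positive class
    tnr = tn / (tn + fp) if (tn + fp) else 0.0        # recall on the negative class
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    g_mean = math.sqrt(tpr * tnr)
    denom = precision + tpr
    f_measure = 2 * precision * tpr / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "g_mean": g_mean, "f_measure": f_measure}

# Toy case (hypothetical data): 2 positives, 8 negatives,
# and a classifier that always predicts the negative class.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
metrics = imbalance_metrics(y_true, y_pred)
```

In this toy case accuracy is high even though no positive sample is found, while G-mean and F-measure both drop to zero, which is why evaluations on imbalanced data usually report them alongside accuracy.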
A new boundary-degree-based oversampling method for imbalanced data
(Springer, 2023) Chen, Yueqi; Pedrycz, Witold; Yang, Jie
Imbalanced data constitute a significant challenge in practical applications, as standard classifiers are usually designed to work on data with balanced class label distributions. One effective approach to the imbalance problem is boundary oversampling, which focuses only on the classification of boundary samples. However, most boundary oversampling methods select boundary samples only roughly, without considering the potentially useful boundary characteristics inherent in the majority (negative) class. To overcome this limitation, we propose a novel boundary-degree-based oversampling method (BDO). The originality of BDO stems from using information entropy to quantify, in terms of probability, the degree to which each negative sample can be regarded as a boundary sample. Applying the sigma rule to the quantified boundary degree determines the negative boundary samples, which are then used to indirectly select minority (positive) boundary samples for oversampling. In this way, a substantial amount of information hidden in the negative class can be mined. To transfer the mined information to the oversampling step, BDO iteratively synthesizes aided boundary points along a fraudulent gradient. Oversampling is finally performed on both the positive boundary samples and the aided boundary points. Experiments on 15 benchmark imbalanced datasets, two multi-label datasets and one large-scale dataset, evaluated in terms of G-mean, F-measure, AUC, accuracy, TPR and TNR, show that BDO exhibits better performance and is competitive with some commonly considered methods.
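The abstract does not say how the entropy-based boundary degree or the sigma rule is computed. As a rough illustration of the general idea only, the sketch below scores each negative sample by the binary entropy of the class mix among its k nearest neighbours and flags samples whose score exceeds the mean by one standard deviation. The k-NN estimate, the one-sigma threshold and all function names are assumptions made for this example, not the BDO procedure itself.

```python
import math
from statistics import mean, pstdev

def binary_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli(p) distribution."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def boundary_degrees(negatives, positives, k=3):
    """Assumed scoring: entropy of the class mix among the k nearest neighbours."""
    labelled = [(x, 0) for x in negatives] + [(x, 1) for x in positives]
    degrees = []
    for i, x in enumerate(negatives):
        # Distances to every other sample (negatives come first, so index i is x itself).
        others = [(math.dist(x, y), label)
                  for j, (y, label) in enumerate(labelled) if j != i]
        others.sort(key=lambda pair: pair[0])
        nearest = others[:k]
        p_pos = sum(label for _, label in nearest) / len(nearest)
        degrees.append(binary_entropy(p_pos))
    return degrees

def select_boundary(degrees, n_sigma=1.0):
    """Assumed sigma rule: keep indices whose degree exceeds mean + n_sigma * std."""
    threshold = mean(degrees) + n_sigma * pstdev(degrees)
    return [i for i, d in enumerate(degrees) if d > threshold]

# Hypothetical toy data: four negatives far from the positives, one close to them.
negatives = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
positives = [(5, 6), (6, 5)]
degrees = boundary_degrees(negatives, positives, k=3)
boundary_idx = select_boundary(degrees)
```

With this toy data only the negative sample next to the positive cluster has a mixed neighbourhood, so it is the only one flagged. Note that a negative surrounded entirely by positives would score zero under this simple entropy measure, which is one reason a real method needs more than this sketch.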
A New Oversampling Method Based on Triangulation of Sample Space
(Ieee-Inst Electrical Electronics Engineers Inc, 2024) Chen, Yueqi; Pedrycz, Witold; Wang, Jian; Zhang, Chao; Yang, Jie
Coping with imbalanced data is a challenging task in practical classification problems. One effective approach is to oversample the minority class. The synthetic minority oversampling technique (SMOTE) is a classical oversampling method, but it has two disadvantages: linear generation and overgeneralization. This article proposes FE-SMOTE, an improved SMOTE method based on the idea of the finite element method. FE-SMOTE not only overcomes these two disadvantages but also generates samples that are more in line with the density distribution of the original minority class than those generated by existing SMOTE variants. The originality of the proposed method stems from constructing a simplex for every minority sample and then triangulating it, which expands the region of synthetic samples from lines to space. A new definition of the relative size of triangular elements helps determine the number of synthetic samples and also weakens the adverse impact of outliers. Samples generated by FE-SMOTE can effectively reflect the local potential distribution structure around every minority sample. Compared with 16 commonly studied oversampling methods, FE-SMOTE produces promising results in terms of G-mean, AUC, F-measure, and accuracy on 22 benchmark imbalanced datasets and the big dataset MNIST.
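To make the "from lines to space" contrast concrete, the sketch below compares classic SMOTE-style interpolation, which places a synthetic point on the segment between a minority sample and one neighbour, with drawing a point uniformly inside a simplex spanned by several minority samples. It does not reproduce FE-SMOTE: the simplex construction, the triangulation and the relative-size weighting are not specified in the abstract, and the function names here are hypothetical.

```python
import random

def smote_line_sample(x, neighbor, rng):
    """Classic SMOTE-style step: a point on the segment between x and one neighbour."""
    lam = rng.random()
    return tuple(a + lam * (b - a) for a, b in zip(x, neighbor))

def simplex_sample(vertices, rng):
    """Uniform point inside the simplex spanned by `vertices`.

    Normalised exponential draws give barycentric weights that are
    uniformly distributed over the simplex (a Dirichlet(1, ..., 1) draw).
    """
    raw = [rng.expovariate(1.0) for _ in vertices]
    total = sum(raw)
    weights = [r / total for r in raw]
    dim = len(vertices[0])
    return tuple(sum(w * v[d] for w, v in zip(weights, vertices))
                 for d in range(dim))

rng = random.Random(0)

# Line-based generation: the synthetic point is confined to a 1-D segment.
on_line = smote_line_sample((0.0, 0.0), (2.0, 2.0), rng)

# Simplex-based generation: the synthetic point can fall anywhere in the triangle.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
in_triangle = simplex_sample(triangle, rng)
```

The line sampler can only ever cover a one-dimensional set between two samples, whereas the simplex sampler fills a region with volume, which is the geometric limitation of linear generation that the abstract refers to.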
