A GAN-Based Hybrid Approach for Addressing Class Imbalance in Machine Learning

Dae-Kyoo Kim¹ and Yeasun K. Chung²

  1. Computer Science and Engineering, Oakland University
    Rochester, Michigan 48309, USA
    kim2@oakland.edu
  2. Spears School of Business, Oklahoma State University
    Stillwater, Oklahoma 74078, USA
    y.chung@okstate.edu

Abstract

Class imbalance is a common problem in machine learning in which the majority class has significantly more instances than the minority class, which biases classifiers toward the majority class. The problem can be effectively addressed by using a Generative Adversarial Network (GAN) to generate realistic synthetic samples. In this work, we present a GAN-based approach that uses hybrid models combining oversampling with undersampling and ensemble techniques to reduce overfitting. The proposed approach was evaluated on two datasets with different levels of class imbalance using six widely used classifiers and compared with two popular class-balancing techniques, SMOTEENN and SMOTETomek. The results show that the proposed approach outperforms both techniques on highly imbalanced datasets.

Keywords

class imbalance, classification, GAN, hybrid model, machine learning

How to cite

Kim, D., Chung, Y. K.: A GAN-Based Hybrid Approach for Addressing Class Imbalance in Machine Learning. Computer Science and Information Systems