A Clustering-Based Enhanced Classification Algorithm for Imbalanced Data

doi:10.12146/j.issn.2095-3135.201402004


Abstract:

Imbalanced data are widespread in real-world applications, and their classification is an active research topic in machine learning. This paper proposes a clustering-based enhanced AdaBoost algorithm to address the poor performance of traditional algorithms on the minority class of imbalanced datasets. The algorithm first constructs a balanced training set by clustering-based undersampling: K-means clusters the majority class, the cluster centroids are extracted, and these centroids are merged with all minority class instances to form a new balanced training set. To avoid the loss of classification accuracy that results when too few minority class samples leave the training set too small, SMOTE (Synthetic Minority Oversampling Technique) is combined with the clustering-based undersampling. Next, the misclassification loss function in the base classifier of AdaBoost is modified according to cost-sensitive learning theory so that samples of different classes are assigned asymmetric misclassification losses. Experimental results show that the proposed algorithm makes the training samples more representative and greatly increases the classification accuracy on the minority class while maintaining overall classification performance.
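
The clustering-based undersampling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the number of clusters, the initialization, or the stopping rule, so the sketch assumes one cluster per minority instance, random initialization from the majority points, and plain Lloyd iterations. The function names (`kmeans_centroids`, `balanced_training_set`) and the toy data are hypothetical, and the SMOTE and cost-sensitive AdaBoost stages are left out.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_centroids(points, k, iterations=50, seed=0):
    """Plain Lloyd's K-means; returns k centroids (requires k <= len(points))."""
    points = [tuple(p) for p in points]
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed initialization: k random points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return centroids


def balanced_training_set(majority, minority, seed=0):
    """Replace the majority class by K-means centroids, then add all minority points.

    Assumption: the number of clusters equals the minority class size, so the
    two classes end up with the same number of training instances.
    Labels: 0 = majority (centroids), 1 = minority (original instances).
    """
    centroids = kmeans_centroids(majority, k=len(minority), seed=seed)
    return [(c, 0) for c in centroids] + [(tuple(m), 1) for m in minority]


# Toy data: six majority points in two groups, two minority points.
majority = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
minority = [(2.5, 2.4), (2.6, 2.7)]
balanced = balanced_training_set(majority, minority)
```

In this sketch the six majority points are reduced to two centroids, so the resulting set holds two instances per class. Using centroids rather than a random subset is what the abstract credits with making the training samples more representative of the majority class.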


HU Xiaosheng, ZHANG Runjing, ZHONG Yong. A Clustering-Based Enhanced Classification Algorithm for Imbalanced Data[J]. Journal of Integration Technology, 2014, 3(2): 35-41.

Online: April 01, 2014