Feature Selection and Ensemble Meta Classifier for Multiclass Imbalance Data Learning

Sainin, Mohd Shamrie and Alfred, Rayner and Alias, Suraya and Lammasha, Mohamed A.M. (2018) Feature Selection and Ensemble Meta Classifier for Multiclass Imbalance Data Learning. In: Knowledge Management International Conference (KMICe) 2018, 25–27 July 2018, Miri, Sarawak, Malaysia.

The aim of this paper is to investigate how combining feature selection with ensemble classifiers affects prediction performance in multiclass imbalanced data learning. The research uses shape data from Malaysian medicinal leaf images and three other large benchmark data sets. Six ensemble methods from the Weka machine learning tool were selected to perform the classification task: AdaBoostM1, Bagging, Decorate, END, MultiBoostAB, and Rotation Forest. In addition, five base classifiers (Naïve Bayes, SMO, J48, Random Forest, and Random Tree) were used to examine the performance of the ensemble methods. Two feature selection approaches were implemented: filter-based (CfsSubsetEval, ConsistencySubsetEval and FilteredSubsetEval) and wrapper-based (WrapperSubsetEval). The experimental results show that, although accuracy is not much improved, the classifiers achieve similar or slightly better accuracy with fewer attributes and less processing time. For knowledge management, the findings provide important insight into which algorithms are suitable for decision making when dealing with high-dimensional and large data.
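
The pipeline described in the abstract (a filter-style feature selection step followed by an ensemble classifier) can be illustrated with a small sketch. This is not the paper's setup: the paper used Weka's evaluators and ensemble methods, whereas the code below substitutes a simple Fisher-style score (between-class over within-class variance) for the filter and bagged nearest-centroid learners for the ensemble. All function names and the imbalanced three-class toy data are hypothetical and exist only for illustration.

```python
import random
from collections import Counter
from statistics import mean


def fisher_score(column, labels):
    """Between-class variance over within-class variance for one feature."""
    overall = mean(column)
    between = within = 0.0
    for cls in set(labels):
        values = [v for v, lab in zip(column, labels) if lab == cls]
        mu = mean(values)
        between += len(values) * (mu - overall) ** 2
        within += sum((v - mu) ** 2 for v in values)
    return between / within if within > 0 else float("inf")


def select_features(X, y, k):
    """Filter step: return the indices of the k highest-scoring features."""
    scores = [fisher_score([row[j] for row in X], y) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def project(X, keep):
    """Keep only the selected feature columns."""
    return [[row[j] for j in keep] for row in X]


def fit_centroids(X, y):
    """Base learner: one mean vector per class."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in groups.items()}


def predict_centroid(model, row):
    """Return the class whose centroid is closest (squared Euclidean)."""
    return min(model, key=lambda c: sum((a - b) ** 2
                                        for a, b in zip(row, model[c])))


def fit_bagging(X, y, n_models=15, seed=0):
    """Ensemble step: train each base learner on a bootstrap resample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in X]
        models.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Majority vote over the base learners."""
    votes = Counter(predict_centroid(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three imbalanced classes, two informative
# features (columns 0 and 1) and two pure-noise features (columns 2 and 3).
rng = random.Random(42)
centers = {"a": (0.0, 0.0), "b": (5.0, 5.0), "c": (0.0, 8.0)}
counts = {"a": 40, "b": 12, "c": 6}
X, y = [], []
for label, n in counts.items():
    cx, cy = centers[label]
    for _ in range(n):
        X.append([rng.gauss(cx, 1), rng.gauss(cy, 1),
                  rng.gauss(0, 1), rng.gauss(0, 1)])
        y.append(label)

keep = select_features(X, y, k=2)
models = fit_bagging(project(X, keep), y)
query = project([[0.0, 8.0, 0.0, 0.0]], keep)[0]
print("kept feature indices:", keep)
print("prediction for a point near class c:", predict_bagging(models, query))
```

The ensemble is trained only on the reduced feature set, which is the sense in which fewer attributes can lower processing time. Whether accuracy holds up depends on the data and on the chosen selector and ensemble, as the abstract reports.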

Item Type: Conference or Workshop Item (Paper)
Additional Information: ISBN: 978-967-0910-87-1 Organized by: School of Computing, College of Arts and Sciences, Universiti Utara Malaysia.
Uncontrolled Keywords: Ensemble, feature selection, multiclass, imbalance, random forest, FilteredSubsetEval.
Subjects: H Social Sciences > HD Industries. Land use. Labor > HD28 Management. Industrial Management
Divisions: School of Computing
Depositing User: Mrs. Norazmilah Yaakub
Date Deposited: 25 Nov 2018 02:29
Last Modified: 25 Nov 2018 02:29
