Winsorised gini impurity: A resistant to outliers splitting metric for classification tree

Chee, Keong Ch'ng and Mahat, Nor Idayu (2014) Winsorised gini impurity: A resistant to outliers splitting metric for classification tree. AIP Conference Proceedings, 1635. pp. 716-723. ISSN 0094-243X

Full text not available from this repository.

Abstract

Constructing a classification tree is sometimes complicated by outliers in the data. Eliminating the outliers is the simplest option, but some important information will be lost. Alternatively, one may amend the values of the outliers, but the suitability of the amended values for classification purposes is arguable. We describe a strategy to identify and handle outliers in the process of constructing a classification tree. A Winsorised approach is suggested for estimating the impurity of the data prior to the splitting of each node of a tree. The proposed estimator provides a splitting value that is resistant to outliers in the data and hence influences the performance of the tree, as measured by its plug-in error rate. We examine the proposed idea on several real data sets of various sample sizes. The results indicate that the proposed strategy is competitive with, and sometimes performs better than, the traditional tree.
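
The abstract does not state the estimator itself, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that Winsorisation means clamping a numeric feature to fixed empirical quantiles before candidate split points are scored with the ordinary Gini impurity. The function names (`gini`, `winsorise`, `best_split`) and the default 5% and 95% cut-offs are hypothetical choices made for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def winsorise(values, lower=0.05, upper=0.95):
    """Clamp values to the [lower, upper] empirical quantiles.

    Assumption: a simple rank-based quantile rule; the paper may use another.
    """
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[int(lower * (n - 1))]
    hi = ordered[int(upper * (n - 1))]
    return [min(max(v, lo), hi) for v in values]


def best_split(values, labels, lower=0.05, upper=0.95):
    """Return (threshold, weighted Gini) of the best binary split
    on the Winsorised feature, or (None, parent Gini) if no split helps."""
    x = winsorise(values, lower, upper)
    n = len(x)
    distinct = sorted(set(x))
    best = (None, gini(labels))
    # Candidate thresholds are midpoints between adjacent distinct values.
    for a, b in zip(distinct, distinct[1:]):
        t = (a + b) / 2.0
        left = [y for v, y in zip(x, labels) if v <= t]
        right = [y for v, y in zip(x, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


if __name__ == "__main__":
    feature = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]  # 1000 is an outlier
    classes = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    print(best_split(feature, classes, lower=0.1, upper=0.9))
```

In this sketch the outlying value is pulled back to the upper cut-off before any threshold is considered, so candidate split points are not placed in the gap created by the outlier. Whether the published estimator Winsorises the feature values, the impurity estimate, or something else is not recoverable from the abstract.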

Item Type: Article
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: School of Quantitative Sciences
Depositing User: Mrs. Norazmilah Yaakub
Date Deposited: 17 Mar 2019 03:01
Last Modified: 17 Mar 2019 03:01
URI: https://repo.uum.edu.my/id/eprint/25772
