
Alternative Methodology of Location Model for Handling Outliers and Empty Cells Problems: Winsorized Smoothed Location Model

Hamid, Hashibah (2019) Alternative Methodology of Location Model for Handling Outliers and Empty Cells Problems: Winsorized Smoothed Location Model. Journal of Mechanics of Continua and Mathematical Sciences (04). pp. 90-108. ISSN 0973-8975

PDF - Published Version
Available under License Attribution 4.0 International (CC BY 4.0).


Abstract

The location model is a familiar basis for discrimination that handles mixed binary and continuous variables simultaneously. The binary variables create cells, while the continuous variables provide the information that measures the difference between groups in each cell. However, if some of the created cells are empty, the classical location model rule is biased and sometimes infeasible. Previous studies have shown that a non-parametric smoothing approach greatly reduces the effects of empty cells. One practical drawback of discrimination methods based on the location model, however, is that the smoothing approach performs poorly when the data sample contains outliers. The purpose of this paper is to address these limitations of the location model in the presence of outliers and empty cells. Accordingly, a new location model rule, called the Winsorized smoothed location model, is developed by combining Winsorization with the non-parametric smoothing approach to address outliers and empty cells at once. Simulation results show the improvement offered by the new rule: misclassification rates decline dramatically even when the data contain outliers, across all 36 simulated data settings. Findings from a real dataset, the full breast cancer data, also clearly show that the newly developed Winsorized smoothed location model achieves the best performance compared with more than 10 existing discrimination methods. These results show that the newly derived rule extends the applicability of the location model, which was previously limited to non-contaminated datasets if it was to achieve tolerable performance. Overall, the investigation verifies that the new rule offers practitioners another potentially good methodology for discrimination tasks, as it compares very favourably with all its competitors except one.
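
As a rough illustration of two ideas named in the abstract, the sketch below shows how binary variables can index cells and how Winsorization limits the influence of outliers on a continuous variable. It is not the paper's procedure: the function names `cell_index` and `winsorize`, the percentile cut-offs, and the toy data are all assumptions made for illustration, and the non-parametric smoothing step is not shown.

```python
import numpy as np


def cell_index(binary_row):
    """Map a vector of b binary variables to one of 2**b cells.

    Hypothetical helper: treats the binary values as digits of a base-2
    number, with the first variable as the least significant digit.
    """
    b = np.asarray(binary_row, dtype=int)
    return int(np.dot(b, 2 ** np.arange(b.size)))


def winsorize(values, lower_pct=5.0, upper_pct=95.0):
    """Clip values outside the given percentiles to those percentiles.

    The 5th/95th percentile cut-offs are an assumed default, not a
    setting taken from the paper.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = np.percentile(values, [lower_pct, upper_pct])
    return np.clip(values, lo, hi)


# Toy data (made up): one continuous variable with a single gross outlier.
x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
x_w = winsorize(x, lower_pct=20.0, upper_pct=80.0)

# The outlier is pulled in towards the bulk of the data, so a mean
# computed from x_w is far less distorted than one computed from x.
print(x.mean(), x_w.mean())

# Two binary variables define 2**2 = 4 cells, indexed 0..3.
print(cell_index([1, 0]), cell_index([0, 1]), cell_index([1, 1]))
```

In a location-model setting, a step of this kind would presumably be applied to the continuous variables before the cell means are estimated and smoothed; the exact order and cut-offs used in the paper are not stated in the abstract.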

Item Type: Article
Uncontrolled Keywords: Outliers, Winsorization, Non-Parametric Smoothing, Location Model Rule, Misclassification Rate
Subjects: Q Science > QA Mathematics
Divisions: School of Quantitative Sciences
Depositing User: Mdm. Sarkina Mat Saad @ Shaari
Date Deposited: 19 May 2024 09:12
Last Modified: 19 May 2024 09:12
URI: https://repo.uum.edu.my/id/eprint/30805
