
Fireflyclust: an automated hierarchical text clustering approach

Mohammed, Athraa Jasim and Yusof, Yuhanis and Husni, Husniza (2017) Fireflyclust: an automated hierarchical text clustering approach. Jurnal Teknologi, 79 (5). pp. 11-22. ISSN 0127-9696

Abstract

Text clustering is a text mining task employed in search engines. Discovering the optimal number of clusters for a dataset or repository is a challenging problem. Various clustering algorithms have been reported in the literature, but most of them rely on a pre-defined number of clusters, k. In this study, a variant of the Firefly algorithm, termed FireflyClust, is proposed to automatically cluster text documents in a hierarchical manner. The proposed clustering method operates in five phases: data pre-processing, clustering, item re-location, cluster selection and cluster refinement. Experiments are undertaken with different threshold values. Results on the TREC collections TR11, TR12, TR23 and TR45 showed that FireflyClust performs better than Bisect K-means, hybrid Bisect K-means and the Practical General Stochastic Clustering Method. These results may guide the development of better information retrieval engines for the dynamic and fast-growing big data era.
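
For readers unfamiliar with the underlying metaheuristic, the following is a minimal sketch of the movement rule in the standard Firefly algorithm, in which a dimmer firefly moves toward each brighter one with an attractiveness that decays with distance. It is not the FireflyClust variant itself, whose five phases are not detailed in this abstract. The function name `firefly_step`, the parameter defaults and the toy data are assumptions made for illustration only.

```python
import math
import random


def firefly_step(positions, brightness, beta0=1.0, gamma=1.0, alpha=0.1, rng=None):
    """One sweep of the standard firefly movement rule (illustrative only).

    positions  : list of coordinate lists, one per firefly
    brightness : list of fitness values, higher means brighter
    beta0      : attractiveness at zero distance
    gamma      : light absorption coefficient
    alpha      : scale of the random perturbation
    """
    # Fixed seed by default so the sketch is reproducible (an assumption).
    rng = rng or random.Random(0)
    new_positions = [list(p) for p in positions]
    for i in range(len(positions)):
        for j in range(len(positions)):
            # Only a dimmer firefly moves toward a brighter one.
            if brightness[j] > brightness[i]:
                # Squared Euclidean distance between firefly i and firefly j.
                r2 = sum((a - b) ** 2 for a, b in zip(new_positions[i], positions[j]))
                # Attractiveness decays exponentially with squared distance.
                beta = beta0 * math.exp(-gamma * r2)
                new_positions[i] = [
                    a + beta * (b - a) + alpha * (rng.random() - 0.5)
                    for a, b in zip(new_positions[i], positions[j])
                ]
    return new_positions


# Toy usage: two fireflies in 2-D; the second is brighter.
moved = firefly_step([[0.0, 0.0], [1.0, 1.0]], [0.1, 0.9])
```

In a text clustering setting, each position would typically be a document vector and the brightness a measure of how well that document represents a cluster, but that mapping is an assumption here and not taken from the paper.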

Item Type: Article
Uncontrolled Keywords: Firefly algorithm, clustering, data mining, swarm intelligence
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: School of Computing
Depositing User: Mrs. Norazmilah Yaakub
Date Deposited: 24 Feb 2019 07:47
Last Modified: 24 Feb 2019 07:47
URI: https://repo.uum.edu.my/id/eprint/25651
