GF-CLUST: A nature-inspired algorithm for automatic text clustering

Mohammed, Athraa Jasim and Yusof, Yuhanis and Husni, Husniza (2016) GF-CLUST: A nature-inspired algorithm for automatic text clustering. Journal of Information and Communication Technology (JICT), 15 (1). pp. 57-81. ISSN 1675-414X

Abstract

Text clustering is the task of grouping similar documents into a cluster while assigning dissimilar ones to other clusters. The well-known K-means algorithm is employed extensively in many disciplines. However, determining the number of clusters when using K-means is a major challenge. This paper presents a new clustering algorithm, termed Gravity Firefly Clustering (GF-CLUST), that utilizes the Firefly Algorithm for dynamic document clustering. GF-CLUST is able to identify the appropriate number of clusters for a given text collection, which is a challenging problem in document clustering. It selects documents having a strong force as centers and creates clusters based on cosine similarity measurement. This is followed by selecting potential clusters and merging small clusters into them. Experiments on various document datasets, such as 20 Newsgroups, Reuters-21578 and the TREC collection, are conducted to evaluate the performance of the proposed GF-CLUST. In terms of purity, F-measure and entropy, GF-CLUST outperforms existing clustering techniques such as K-means, Particle Swarm Optimization (PSO) and the Practical General Stochastic Clustering Method (pGSCM). Furthermore, the number of clusters obtained by GF-CLUST is closer to the actual number of clusters than that obtained by pGSCM.
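
The abstract reports purity and entropy without defining them. As a rough illustration only, the sketch below computes both measures for a flat clustering using one common textbook formulation: purity is the fraction of documents that fall in the majority class of their cluster, and entropy is the size-weighted class entropy per cluster, normalized by the logarithm of the number of classes. The function name `purity_and_entropy` and the normalization choice are assumptions made for this example; the paper may define the measures differently.

```python
import math
from collections import Counter, defaultdict


def purity_and_entropy(cluster_ids, class_labels):
    """Return (purity, entropy) for a flat clustering.

    Assumed definitions (not taken from the paper):
      purity  = sum over clusters of the majority-class count, divided by N
      entropy = size-weighted average of per-cluster class entropy,
                normalized by log(number of classes) to lie in [0, 1]
    """
    if not cluster_ids or len(cluster_ids) != len(class_labels):
        raise ValueError("inputs must be non-empty and of equal length")

    # Count how many documents of each class fall in each cluster.
    per_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        per_cluster[cluster][label] += 1

    n_docs = len(cluster_ids)
    n_classes = len(set(class_labels))
    majority_total = 0
    entropy = 0.0

    for counts in per_cluster.values():
        size = sum(counts.values())
        majority_total += max(counts.values())
        # Class entropy inside this cluster (natural logarithm).
        h = -sum((k / size) * math.log(k / size) for k in counts.values())
        if n_classes > 1:
            h /= math.log(n_classes)
        entropy += (size / n_docs) * h

    return majority_total / n_docs, entropy


if __name__ == "__main__":
    # Toy example: six documents, two clusters, two true classes.
    clusters = [0, 0, 0, 1, 1, 1]
    labels = ["sport", "sport", "politics", "politics", "politics", "politics"]
    print(purity_and_entropy(clusters, labels))
```

Under these definitions, higher purity and lower entropy both indicate clusters that align more closely with the true classes.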

Item Type: Article
Uncontrolled Keywords: Firefly algorithm, text clustering, divisive clustering, dynamic clustering
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: School of Computing
Depositing User: Dr. Yuhanis Yusof
Date Deposited: 08 Aug 2016 04:42
Last Modified: 08 Aug 2016 04:42
URI: https://repo.uum.edu.my/id/eprint/18484
