UUM Repository | Universiti Utara Malaysian Institutional Repository

Source code classification using latent semantic indexing with structural and frequency term weighting


Yusof, Yuhanis and Alhersh, Taha and Mahmuddin, Massudi and Mohamed Din, Aniza (2012) Source code classification using latent semantic indexing with structural and frequency term weighting. Research Journal of Applied Sciences, 7 (5). pp. 266-271. ISSN 1815-932X

Abstract

In recent years, the number of open source software projects has increased, and with it the demand for automatic software classification. Latent Semantic Indexing (LSI) is an information retrieval approach that can be used to classify source code programs. This research proposes an LSI classifier that integrates structural information and term frequency in its weighting scheme. The content terms are identified by extracting words from the source code program. In the experiment undertaken, the LSI classifier achieved higher precision and recall than the C4.5 algorithm. Furthermore, the use of structural information in the weighting scheme was found to contribute to better classification.
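
The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the weighting formula, so the structural weights below (terms in identifiers count double, terms in comments count once), the toy programs, the category labels, and all function names are assumptions made for illustration. The sketch builds a weighted term-document matrix, reduces it with a truncated SVD, folds a new program into the latent space, and assigns the label of the most similar training program by cosine similarity. It assumes NumPy is installed.

```python
import re
from collections import Counter

import numpy as np

# Hypothetical structural weights (assumption, not taken from the paper).
STRUCT_WEIGHTS = {"identifier": 2.0, "comment": 1.0}


def weighted_terms(doc):
    """Map each term to the sum of its structural weights over all occurrences.

    `doc` is a dict: structural location -> text found at that location.
    Summing per occurrence combines frequency with structural information.
    """
    weights = Counter()
    for location, text in doc.items():
        for term in re.findall(r"[a-z]+", text.lower()):
            weights[term] += STRUCT_WEIGHTS[location]
    return weights


def build_lsi(docs, k):
    """Build a rank-k LSI model from the training documents."""
    vocab = sorted({t for d in docs for t in weighted_terms(d)})
    index = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))  # terms x documents
    for j, d in enumerate(docs):
        for t, w in weighted_terms(d).items():
            A[index[t], j] = w
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Rows of Vt[:k].T are the training documents in the latent space.
    return index, U[:, :k], s[:k], Vt[:k, :].T


def fold_in(doc, index, U_k, s_k):
    """Project an unseen document into the latent space: q^T U_k S_k^{-1}."""
    q = np.zeros(len(index))
    for t, w in weighted_terms(doc).items():
        if t in index:  # terms outside the training vocabulary are ignored
            q[index[t]] = w
    return (q @ U_k) / s_k


def classify(doc, model, labels):
    """Return the label of the training document with the highest cosine similarity."""
    index, U_k, s_k, doc_vecs = model
    q = fold_in(doc, index, U_k, s_k)
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q) + 1e-12
    return labels[int(np.argmax(doc_vecs @ q / norms))]


# Toy training set (invented for this example).
train = [
    {"identifier": "socket connect send", "comment": "open network connection"},
    {"identifier": "socket listen accept", "comment": "network server"},
    {"identifier": "matrix multiply transpose", "comment": "linear algebra"},
    {"identifier": "matrix inverse determinant", "comment": "algebra routine"},
]
labels = ["network", "network", "math", "math"]
model = build_lsi(train, k=2)

print(classify({"identifier": "socket accept", "comment": "server"}, model, labels))
print(classify({"identifier": "matrix transpose", "comment": ""}, model, labels))
```

Setting every entry of STRUCT_WEIGHTS to 1.0 reduces the scheme to plain term frequency, which gives a simple baseline for checking how much the structural weights change the result on a given data set.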

Item Type: Article
Uncontrolled Keywords: Latent semantic indexing, software classification, C4.5, term weighting, algorithm, synonymy
Subjects: Q Science > QA Mathematics > QA76 Computer software
Divisions: College of Arts and Sciences
Depositing User: Dr. Yuhanis Yusof
Date Deposited: 24 Mar 2014 03:12
Last Modified: 24 Mar 2014 03:12
URI: http://repo.uum.edu.my/id/eprint/9501
