UUM Repository | Universiti Utara Malaysian Institutional Repository

Rule-based filtering algorithm for textual document

Jamil, Nurul Syafidah and Ku-Mahamud, Ku Ruhana and Mohamed Din, Aniza (2017) Rule-based filtering algorithm for textual document. International Journal of Science and Engineering Investigations, 6 (61). pp. 44-48. ISSN 2251-8843


Textual documents are usually unstructured and high-dimensional. Exploring the hidden information in unstructured text is useful for finding interesting patterns and valuable knowledge. However, not all terms in a text are relevant, and irrelevant terms can lead to misclassification. Improper filtering might also remove terms that have similar meanings. Thus, to reduce the high dimensionality of text, this study proposes a filtering algorithm that filters the important terms from the pre-processed text and applies a term weighting scheme to address the synonym problem, which helps the selection of relevant terms. The proposed filtering algorithm uses a keyword library of special terms, developed to ensure that important terms are not eliminated during filtration. The performance of the proposed filtering algorithm is compared with the rough set attribute reduction (RSAR) and information retrieval (IR) approaches. In the experiment, the proposed filtering algorithm outperformed both RSAR and IR in terms of extracted relevant terms.
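The abstract does not give the rules themselves, so the following is only a minimal sketch of how a rule-based filter with a protected keyword library and synonym-aware weighting might look. Everything in it is an assumption for illustration: the contents of `KEYWORD_LIBRARY`, `SYNONYMS`, and `STOPWORDS`, the function name `filter_terms`, the `min_count` threshold, and the use of raw term frequency as the weight in place of the paper's term weighting scheme.

```python
from collections import Counter

# Hypothetical data: the abstract does not list the actual library or synonyms.
KEYWORD_LIBRARY = {"rsar", "tf-idf"}                 # special terms that must survive filtering
SYNONYMS = {"automobile": "car", "vehicle": "car"}   # synonym -> canonical term
STOPWORDS = {"the", "a", "is", "of", "and"}


def filter_terms(tokens, min_count=2):
    """Return the terms kept by a simple rule-based filter, in first-seen order."""
    # Rule 1: map synonyms onto one canonical term so their counts are pooled.
    canonical = [SYNONYMS.get(t.lower(), t.lower()) for t in tokens]
    # Term weight: raw frequency, a stand-in for the paper's weighting scheme.
    weights = Counter(canonical)
    kept = []
    for term in canonical:
        if term in kept:
            continue
        if term in KEYWORD_LIBRARY:
            # Rule 2: terms in the keyword library are always kept.
            kept.append(term)
        elif term not in STOPWORDS and weights[term] >= min_count:
            # Rule 3: other terms need to be non-stopwords with enough weight.
            kept.append(term)
    return kept


if __name__ == "__main__":
    sample = ["The", "car", "and", "the", "automobile", "use", "RSAR", "once"]
    print(filter_terms(sample))
```

In this sketch, pooling synonyms before weighting is what keeps "automobile" from being discarded as a rare term, and the keyword library check runs before the threshold so that a special term seen only once is still retained.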

Item Type: Article
Uncontrolled Keywords: Topic Identification, Filtering Algorithm, Synonym, Textual Document
Subjects: Q Science > QA Mathematics > QA76 Computer software
Divisions: School of Computing
Depositing User: Prof. Dr. Ku Ruhana Ku Mahamud
Date Deposited: 19 Apr 2017 07:42
Last Modified: 19 Apr 2017 07:42
URI: http://repo.uum.edu.my/id/eprint/21718
