UUM Repository | Universiti Utara Malaysia Institutional Repository

Normalization of common noisy terms in Malaysian online media


Samsudin, Norlela and Puteh, Mazidah and Hamdan, Abdul Razak and Ahmad Nazri, Mohd Zakree (2012) Normalization of common noisy terms in Malaysian online media. In: Knowledge Management International Conference (KMICe) 2012, 4 – 6 July 2012, Johor Bahru, Malaysia.


Abstract

This paper proposes a technique for normalizing noisy terms that occur in Malaysian micro-texts. Noisy terms are common in online messages and influence the results of activities such as text classification and information retrieval. Even though many researchers have studied methods to solve this problem, few have looked into it in a language other than English. In this study, about 5000 noisy texts were extracted from 15000 documents created by Malaysians. Normalization was performed using specific translation rules as part of the preprocessing steps in opinion mining of movie reviews. The results show up to a 5% improvement in the accuracy of opinion mining.
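The abstract does not list the translation rules themselves, so the following is only a minimal sketch of what rule-based normalization as a preprocessing step might look like. The lookup table `NOISY_TO_STANDARD`, the repeated-character rule, and the function name `normalize` are all assumptions made for illustration; they are not the rules or the implementation used in the paper.

```python
import re

# Hypothetical translation rules: illustrative entries only,
# not the rule set described in the paper.
NOISY_TO_STANDARD = {
    "yg": "yang",
    "sgt": "sangat",
    "dgn": "dengan",
    "tak": "tidak",
}

# Assumed extra rule: collapse a character repeated three or more
# times in a row (e.g. "bestttt") down to a single occurrence.
REPEATED_CHARS = re.compile(r"(.)\1{2,}")


def normalize(text: str) -> str:
    """Lowercase the text, collapse elongated characters, and map
    known noisy tokens to their standard forms."""
    normalized = []
    for token in text.lower().split():
        token = REPEATED_CHARS.sub(r"\1", token)
        normalized.append(NOISY_TO_STANDARD.get(token, token))
    return " ".join(normalized)


if __name__ == "__main__":
    print(normalize("Cerita yg sgt bestttt"))
```

In a pipeline like the one the abstract describes, a function of this kind would run on each review before feature extraction, so that variant spellings of the same word are counted as one term by the opinion-mining classifier.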

Item Type: Conference or Workshop Item (Paper)
Additional Information: ISBN: 9789832078661 Organized by: UUM College of Art & Sciences, Universiti Utara Malaysia
Uncontrolled Keywords: noisy text, text classification, opinion mining, Malaysian online reviews
Subjects: H Social Sciences > HD Industries. Land use. Labor > HD28 Management. Industrial Management
Divisions: College of Arts and Sciences
Depositing User: Mrs. Norazmilah Yaakub
Date Deposited: 04 May 2014 10:55
Last Modified: 25 May 2015 03:06
URI: http://repo.uum.edu.my/id/eprint/10933
