Morphological segmentation and analysis of Bangla text

Saha, G C and Saha, Hasi and Che Mat, Ruzinoor and Khan, Nur Hossain and Sarker, Bappa (2016) Morphological segmentation and analysis of Bangla text. International Journal of Interactive Digital Media (IJIDM), 4 (3). pp. 15-20. ISSN 2289-4098

Abstract

This paper deals with lexicon and system development for word segmentation in the Bangla language. Our goal is to develop a morphological segmentation algorithm that works well for Bangla and to address the problem of unsupervised word segmentation across different languages. From a hand-corrected Bangla corpus, 5000 popular words were manually segmented into suffixes, prefixes and roots. These formed the sample lexicon used as the seed for the next step. A system was developed in the C language to automate the segmentation process based on this handmade lexical database. The system was evaluated on several pages of Bangla text and achieved a success rate of about 83.05%. In our observation, the system would work with full success if the lexicon database were twice its current volume. The system may have a large impact, particularly in helping people learn and use Bangla, which could greatly enhance their socio-economic lives.
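The abstract does not give the algorithm itself, so the following is only a minimal sketch of what lexicon-driven segmentation into prefix, root and suffix might look like in C. Everything in it is an assumption made for illustration: the tiny romanized lexicon, the `Segmentation` struct, and the `segment` function are hypothetical and are not taken from the paper, whose 5000-word lexicon and matching strategy may differ.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical toy lexicon, romanized for readability. The paper's
   hand-built 5000-word lexicon is not reproduced here. */
static const char *PREFIXES[] = { "o", "upo", "proti" };
static const char *SUFFIXES[] = { "ra", "der", "gulo", "ke", "te" };
static const char *ROOTS[]    = { "chhele", "boi", "din", "kar" };

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

typedef struct {
    char prefix[32];
    char root[64];
    char suffix[32];
} Segmentation;

/* Return 1 if the first len bytes of s exactly match a known root. */
static int is_root(const char *s, size_t len) {
    for (size_t i = 0; i < COUNT(ROOTS); i++)
        if (strlen(ROOTS[i]) == len && strncmp(ROOTS[i], s, len) == 0)
            return 1;
    return 0;
}

/* Try every (prefix, suffix) pair, with index 0 standing for "no affix",
   and accept the first split whose middle part is a known root.
   Returns 1 and fills *out on success, 0 if no split is found. */
int segment(const char *word, Segmentation *out) {
    assert(word != NULL && out != NULL);
    size_t wlen = strlen(word);

    for (size_t p = 0; p <= COUNT(PREFIXES); p++) {
        const char *pre = (p == 0) ? "" : PREFIXES[p - 1];
        size_t plen = strlen(pre);
        if (plen >= wlen || strncmp(word, pre, plen) != 0)
            continue;

        for (size_t s = 0; s <= COUNT(SUFFIXES); s++) {
            const char *suf = (s == 0) ? "" : SUFFIXES[s - 1];
            size_t slen = strlen(suf);
            if (plen + slen >= wlen)
                continue;                      /* root would be empty */
            if (strcmp(word + wlen - slen, suf) != 0)
                continue;                      /* word does not end with suf */

            size_t rlen = wlen - plen - slen;
            if (rlen >= sizeof out->root || !is_root(word + plen, rlen))
                continue;

            strcpy(out->prefix, pre);
            memcpy(out->root, word + plen, rlen);
            out->root[rlen] = '\0';
            strcpy(out->suffix, suf);
            return 1;
        }
    }
    return 0;
}
```

With this toy lexicon, calling `segment("chhelera", &seg)` splits the word into the root `chhele` and the suffix `ra`, while a word whose root is missing from `ROOTS` is reported as unsegmented. That second case is consistent with the abstract's remark that coverage depends on lexicon size. Real Bangla input would be UTF-8 encoded; byte-wise comparison as used here still works for exact affix matching, but that is an assumption of this sketch, not a detail from the paper.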

Item Type: Article
Uncontrolled Keywords: Bangla, Natural Language Processing, Lexicon, suffixes, prefixes and roots, morphological segmentation
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: School of Multimedia Technology & Communication
Depositing User: Dr. Ruzinoor Che Mat
Date Deposited: 05 Apr 2017 02:50
Last Modified: 05 Apr 2017 02:50
URI: https://repo.uum.edu.my/id/eprint/21406
