Saha, G C and Saha, Hasi and Che Mat, Ruzinoor and Khan, Nur Hossain and Sarker, Bappa (2016) Morphological segmentation and analysis of Bangla text. International Journal of Interactive Digital Media (IJIDM), 4 (3). pp. 15-20. ISSN 2289-4098
Abstract
This paper deals with lexicon and system development for word segmentation in the Bangla language. Our goal is to develop a morphological segmentation algorithm that works well for Bangla and to address the problem of unsupervised word segmentation across different languages. From a hand-corrected Bangla corpus, 5000 popular words were manually segmented into suffixes, prefixes and roots. These formed the sample lexicon used as the seed for the next step. A system was developed in the C language to automate the segmentation process based on this hand-made lexical database. The system was evaluated on several pages of Bangla text and achieved a success rate of about 83.05%. In our observation, the system would work with full success if the volume of the lexicon database were doubled, and it may have a large impact, particularly in helping people learn and use Bangla, which would greatly enhance their socio-economic life.
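The abstract does not give the algorithm itself, so the following is only a minimal sketch of how a lexicon-driven prefix/root/suffix split could look in C, not the authors' implementation. The `Segmentation` struct, the `segment_word` function, and the tiny romanized lexicon entries are all hypothetical placeholders; the paper's database holds 5000 hand-segmented Bangla words. The sketch simply tries every split and accepts the first one whose parts all appear in the lexicon.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy lexicon in romanized form; NOT the paper's database. */
static const char *const PREFIXES[] = { "o", "be" };
static const char *const SUFFIXES[] = { "ra", "der", "ke" };
static const char *const ROOTS[]    = { "manush", "chele", "kaj" };

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

typedef struct {
    char prefix[32];
    char root[32];
    char suffix[32];
} Segmentation;

/* Return 1 if the len bytes starting at s exactly match an entry in list. */
static int in_list(const char *const *list, size_t n, const char *s, size_t len)
{
    for (size_t i = 0; i < n; i++) {
        if (strlen(list[i]) == len && memcmp(list[i], s, len) == 0)
            return 1;
    }
    return 0;
}

/* Copy len bytes from src into dst and terminate the string. */
static void copy_span(char *dst, const char *src, size_t len)
{
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/*
 * Try every prefix/suffix split (empty affixes allowed) and accept the
 * first one whose non-empty parts are all found in the lexicon.
 * Returns 1 on success and fills *out; returns 0 if no split matches.
 */
int segment_word(const char *word, Segmentation *out)
{
    assert(word != NULL && out != NULL);
    size_t n = strlen(word);
    if (n == 0 || n >= sizeof(out->root))
        return 0;

    for (size_t p = 0; p < n; p++) {              /* p = prefix length */
        if (p > 0 && !in_list(PREFIXES, COUNT(PREFIXES), word, p))
            continue;
        for (size_t s = 0; p + s < n; s++) {      /* s = suffix length */
            if (s > 0 && !in_list(SUFFIXES, COUNT(SUFFIXES), word + n - s, s))
                continue;
            size_t r = n - p - s;                 /* r = root length, >= 1 */
            if (!in_list(ROOTS, COUNT(ROOTS), word + p, r))
                continue;
            copy_span(out->prefix, word, p);
            copy_span(out->root, word + p, r);
            copy_span(out->suffix, word + n - s, s);
            return 1;
        }
    }
    return 0;
}
```

Calling `segment_word("okajke", &seg)` with this toy lexicon fills `seg` with prefix `o`, root `kaj`, and suffix `ke`. A word with no matching split returns 0, which is the case where a larger lexicon would be expected to help. Because the comparison is byte-wise, the same code would accept UTF-8 encoded Bangla strings in place of the romanized placeholders.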
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Bangla, Natural Language Processing, Lexicon, suffixes, prefixes and roots, morphological segmentation |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | School of Multimedia Technology & Communication |
Depositing User: | Dr. Ruzinoor Che Mat |
Date Deposited: | 05 Apr 2017 02:50 |
Last Modified: | 05 Apr 2017 02:50 |
URI: | https://repo.uum.edu.my/id/eprint/21406 |