A Discourse-Based Information Retrieval for Tamil Literary Texts

Ramalingam, Anita and Navaneethakrishnan, Subalalitha Chinnaudayar (2021) A Discourse-Based Information Retrieval for Tamil Literary Texts. Journal of Information and Communication Technology, 20 (03). pp. 353-389. ISSN 2180-3862

Tamil literature contains many valuable thoughts that can help people lead a successful and happy life. Tamil literary works are abundantly available and widely searched on the World Wide Web (WWW), but existing search systems follow a keyword-based matching strategy that fails to satisfy user needs. This creates a demand for a focused Information Retrieval system that semantically analyses Tamil literary text and thereby improves search performance. This paper proposes a novel Information Retrieval framework that uses discourse processing techniques to aid the semantic analysis and representation of Tamil literary text. The proposed framework has been tested using two ancient literary works, the Thirukkural and the Naladiyar, which were written during 300 BCE. The Thirukkural comprises 1330 couplets, each 7 words long, while the Naladiyar consists of 400 quatrains, each 15 words long. The proposed system, tested with all 1330 Thirukkural couplets and 400 Naladiyar quatrains, achieved a mean average precision (MAP) score of 89%. Its performance has been compared with Google Tamil search and with a keyword-based search, which is a substandard version of the proposed framework. Google Tamil search achieved a MAP score of 56% and the keyword-based method achieved a MAP score of 62%, which shows that discourse processing techniques improve the search performance of an Information Retrieval system.
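
For readers unfamiliar with the MAP metric reported above, the following is a minimal sketch of how mean average precision is conventionally computed from ranked results and relevance judgments. It is not the authors' evaluation code; the function names, the document identifiers and the toy queries are assumptions made purely for illustration.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision for one query.

    ranked:   document ids in the order the system returned them
    relevant: ids judged relevant for the query
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[str], Set[str]]]
) -> float:
    """Mean of the per-query average precision values."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical toy data: two queries over made-up verse ids.
toy_runs = [
    (["kural_12", "kural_7", "kural_90"], {"kural_12", "kural_90"}),
    (["nala_3", "nala_44"], {"nala_44"}),
]
map_score = mean_average_precision(toy_runs)
```

In this toy run the first query has an average precision of (1/1 + 2/3) / 2 and the second of 1/2, so the MAP is their mean, roughly 0.67. A score of 89% therefore indicates that relevant verses tend to appear near the top of the ranking across the test queries.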

Item Type: Article
Additional Information: Printed by UUM Press
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: College of Arts and Sciences
Depositing User: Mrs Nurin Jazlina Hamid
Date Deposited: 31 Jul 2022 07:57
Last Modified: 31 Jul 2022 07:57
