Clustering Sinhala news articles using corpus-based similarity measures

Nanayakkara, P; Ranathunga, S

Date

2018-05

Authors

Nanayakkara, P

Ranathunga, S

Publisher

IEEE

Abstract

News aggregators help readers handle large numbers of news items conveniently by collecting them in a single place and organizing them into meaningful groups. Such news aggregators and clustering tools are available for English and some other widely used languages. However, no such tools are available for the Sinhala language. To address this gap, this paper presents a system that collects news articles published across the web and groups related articles using corpus-based similarity measures. Despite the simplicity of the technique and the morphological richness of Sinhala, we achieved very promising results that demonstrate the viability of the presented technique.

Keywords

Document clustering, corpus-based similarity measurement, Sinhala

Citation

P. Nanayakkara and S. Ranathunga, "Clustering Sinhala News Articles Using Corpus-Based Similarity Measures," 2018 Moratuwa Engineering Research Conference (MERCon), 2018, pp. 437-442, doi: 10.1109/MERCon.2018.8421890.

URI

http://dl.lib.uom.lk/handle/123/18670

DOI

10.1109/MERCon.2018.8421890

Collections

MERCon - 2018
