Abstract:
News aggregators help readers to handle large
numbers of news items in a convenient manner by collecting them
into a single place with meaningful groupings. Such news
aggregators and clustering tools are available for English and some other
popular languages. However, no such tools are available for the
Sinhala language. To address this gap, this paper presents a
system to collect news articles published across the web and group
related articles using corpus-based similarity measures. Despite
the simplicity of the technique and the morphological richness of
Sinhala, we achieved promising results that demonstrate the
viability of the presented technique.
Citation:
P. Nanayakkara and S. Ranathunga, "Clustering Sinhala News Articles Using Corpus-Based Similarity Measures," 2018 Moratuwa Engineering Research Conference (MERCon), 2018, pp. 437-442, doi: 10.1109/MERCon.2018.8421890.