A Bibliographic Data Mining Framework to Recognize Research Excellence

dc.contributor.author: Seneviratne, TM
dc.contributor.author: Ariyasinghe, N
dc.contributor.author: Jayawardena, CL
dc.contributor.author: Jayasekara, AGBP
dc.contributor.author: Gopura, RARC
dc.date.accessioned: 2025-11-12T06:13:30Z
dc.date.issued: 2025
dc.description.abstract: Sri Lankan university libraries are actively involved in evaluating the research performance of higher educational institutes, relying on approaches driven by publication data. The National Research Council (NRC) of Sri Lanka also evaluates publications and patents based on Scopus data, without inviting individual applications for scrutiny. This study presents an initiative by the University Library and the Senate Research Committee (SRC) of the University of Moratuwa to introduce a multi-source data mining framework, based on publication metrics, to identify research excellence. Bibliometric data sources, namely Scopus, Scimago, the Web of Science Core Collection, and Google Scholar's top 20 subcategory rankings, were used to develop a semi-automated scoring system. A total of 639 publication records, extracted from a Scopus affiliation search for the calendar year 2023, were subjected to manual data mining. From the identified departmental-level publications, 383 individual-level publications were retrieved and the information was converted into text files. A Python script was used to map each publication to the Scimago Journal Rank, Web of Science Core Collection, and Google Scholar Top 20 metrics and to categorize it according to pre-established evaluation criteria provided by the SRC. The datasets were cross-referenced with current quality indicators, including Scimago journal quartile (Q1–Q4), Google Scholar rankings, and Web of Science indexing (AHCI, SCIE, SSCI, ESCI). After binary flags and ranking categories were added, each publication was assigned a code based on its quality metrics. Award categories were assigned to individuals based on their publication scores and consistency in quality, and the results were compiled into a Microsoft Excel file for reporting. The analysis identified 17 recipients for the Vice-Chancellor's Award and 142 for Outstanding Research Awards, of which 62 were Distinctions. It was also found that 68% of the articles are in the Q1 quartile and 17% in the Q2 quartile, while 49.78% are indexed in SCIE and 31.72% in ESCI. The resulting framework is replicable and semi-automated, assesses the quality of research publications from large datasets, and can be applied at different scales to provide a more comprehensive evaluation than the single-platform approaches currently in use. This novel approach can be used to identify and reward high-calibre research without requiring an individual application or submission process. It also has potential for institutional research and cross-collaboration assessments (intra-departmental, inter-departmental, inter-faculty, local, and international), for resource allocation, and for strategic planning beyond the recognition of research excellence.
dc.identifier.conference: 15th International Conference of the University Librarians Association of Sri Lanka
dc.identifier.email: thusharims@uom.lk
dc.identifier.uri: https://dl.lib.uom.lk/handle/123/24380
dc.language.iso: en
dc.subject: Publication Evaluations
dc.subject: Bibliographic Information
dc.subject: Data Mining
dc.title: A Bibliographic Data Mining Framework to Recognize Research Excellence
dc.type: Conference-Abstract
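
Illustrative sketch (not part of the deposited record): the abstract describes a Python script that attaches binary flags and a quality code to each publication from its Scimago quartile, Web of Science index, and Google Scholar Top 20 status. The authors' script and the SRC criteria are not reproduced in this record, so the following is only a minimal sketch of that kind of mapping. The `Publication` class, the function names, and the code format (for example `Q1-SCIE-G1`) are all hypothetical, and no award scoring is attempted because the scoring rules are not stated.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Indicator values named in the abstract.
QUARTILES = ("Q1", "Q2", "Q3", "Q4")
WOS_INDEXES = ("AHCI", "SCIE", "SSCI", "ESCI")


@dataclass
class Publication:
    """Hypothetical record for one publication after cross-referencing."""
    title: str
    sjr_quartile: Optional[str] = None   # "Q1".."Q4", or None if unranked
    wos_index: Optional[str] = None      # one of WOS_INDEXES, or None
    in_google_top20: bool = False        # listed in a Google Scholar Top 20


def binary_flags(pub: Publication) -> Dict[str, int]:
    """Return 0/1 flags for each quality indicator (names are assumptions)."""
    flags = {f"is_{q.lower()}": int(pub.sjr_quartile == q) for q in QUARTILES}
    flags.update({f"in_{w.lower()}": int(pub.wos_index == w) for w in WOS_INDEXES})
    flags["in_google_top20"] = int(pub.in_google_top20)
    return flags


def quality_code(pub: Publication) -> str:
    """Build a compact code; the format is illustrative, not the SRC's."""
    quartile = pub.sjr_quartile if pub.sjr_quartile in QUARTILES else "NQ"
    wos = pub.wos_index if pub.wos_index in WOS_INDEXES else "NW"
    google = "G1" if pub.in_google_top20 else "G0"
    return f"{quartile}-{wos}-{google}"


def quartile_share(pubs: List[Publication]) -> Dict[str, float]:
    """Percentage of publications in each Scimago quartile."""
    total = len(pubs)
    if total == 0:
        return {q: 0.0 for q in QUARTILES}
    return {
        q: round(100 * sum(p.sjr_quartile == q for p in pubs) / total, 2)
        for q in QUARTILES
    }


if __name__ == "__main__":
    # Made-up sample records, for demonstration only.
    sample = [
        Publication("Paper A", "Q1", "SCIE", True),
        Publication("Paper B", "Q2", "ESCI"),
        Publication("Paper C"),
    ]
    for pub in sample:
        print(pub.title, quality_code(pub), binary_flags(pub))
    print(quartile_share(sample))
```

Codes of this kind could then be aggregated per author and exported to a spreadsheet for reporting, as the abstract describes; the aggregation and award thresholds would have to follow the SRC criteria, which this record does not give.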

Files

Original bundle

Name: Seneviratne et al 2025.pdf
Size: 260.23 KB
Format: Adobe Portable Document Format
