Structuring the knowledge for systematic information retrieval - knowledge graph and machine learning approach
| dc.contributor.advisor | Ambegoda, T | |
| dc.contributor.author | Ahamed, MFS | |
| dc.date.accept | 2025 | |
| dc.date.accessioned | 2026-02-09T09:06:56Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | The COVID-19 pandemic has led to the publication of a massive number of research papers, making it hard for researchers to find relevant information quickly. This study addresses the problem by using knowledge graphs to organize and analyze data from the Kaggle CORD-19 dataset and AWS metadata. Over 401,270 PDF files and 315,742 PMC JSON files were processed, supported by millions of metadata connections. Knowledge graphs were created to show relationships between topics, countries, institutions, authors, concepts, and sentiment scores, allowing researchers to explore the data in multiple ways. A BERT-based sentiment analysis model was used to assign sentiment scores to papers, adding 32,299 new connections to the graph. These scores grouped papers with similar tones and emotions, helping to uncover hidden patterns and trends. With these insights integrated into a combined knowledge graph, researchers can traverse connections across metadata properties such as authors, institutions, topics, and sentiment scores, broadening the scope of discovery within the CORD-19 dataset. Visualizations showed how papers are connected to different metadata properties, such as the countries where research originated, the institutions involved, and overlapping research themes. Concept graphs included confidence scores to show how strongly a paper is linked to a concept. Sentiment graphs added new layers of connections that go beyond traditional metadata. Statistics highlight the size and complexity of these graphs, with 453,633 country edges, 476,865 institutional edges, and 1,783,589 concept edges. In addition, the average connectivity per node increases after sentiment scores are added to the knowledge graph. This study shows that knowledge graphs are a powerful way to organize and explore large collections of research papers. Adding sentiment analysis improves the depth of analysis, making it easier to find valuable information and uncover new insights. 
This method can be applied to other fields in the future, providing a strong tool for solving global challenges by organizing and analyzing large datasets. | |
| dc.identifier.accno | TH5996 | |
| dc.identifier.citation | Ahamed, M.F.S. (2025). Structuring the knowledge for systematic information retrieval - knowledge graph and machine learning approach [Master’s thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. https://dl.lib.uom.lk/handle/123/24823 | |
| dc.identifier.degree | MSc in Computer Science | |
| dc.identifier.department | Department of Computer Science & Engineering | |
| dc.identifier.faculty | Engineering | |
| dc.identifier.uri | https://dl.lib.uom.lk/handle/123/24823 | |
| dc.language.iso | en | |
| dc.subject | KNOWLEDGE GRAPHS | |
| dc.subject | SEMANTIC NETWORKS | |
| dc.subject | MACHINE LEARNING | |
| dc.subject | INFORMATION RETRIEVAL | |
| dc.subject | COVID-19 OPEN RESEARCH DATASET | |
| dc.subject | SENTIMENT ANALYSIS | |
| dc.subject | COMPUTER SCIENCE-Dissertation | |
| dc.subject | COMPUTER SCIENCE AND ENGINEERING-Dissertation | |
| dc.subject | MSc in Computer Science | |
| dc.title | Structuring the knowledge for systematic information retrieval - knowledge graph and machine learning approach | |
| dc.type | Thesis-Full-text |
Files
Original bundle
- Name: TH5996-1.pdf
- Size: 993.22 KB
- Format: Adobe Portable Document Format
- Description: Pre-text
- Name: TH5996-2.pdf
- Size: 311.29 KB
- Format: Adobe Portable Document Format
- Description: Post-text
- Name: TH5996.pdf
- Size: 4.08 MB
- Format: Adobe Portable Document Format
- Description: Full-thesis
License bundle
- Name: license.txt
- Size: 1.71 KB
- Format: Item-specific license agreed upon to submission
- Description:
