Institutional-Repository, University of Moratuwa.  

Information extraction from Sri Lankan job advertisements via rule-based approach

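dc.contributor.author Bandara, RMHD
dc.contributor.author Gunasekara, HASS
dc.contributor.author Peiris, WADS
dc.contributor.author Wijekoon, WMHC
dc.contributor.author De Silva, TS
dc.contributor.author Hewawalpita, SGS
dc.contributor.author Rathnayake, HMSC
dc.date.accessioned 2021-12-07T06:24:30Z
dc.date.available 2021-12-07T06:24:30Z
dc.date.issued 2021-12-03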
dc.description.abstract One of the major problems in the Sri Lankan labour market is the lack of demand-side information, which has created a gap between the supply of and demand for labour. Job advertisements provide a wide range of real-time information about what is in demand, such as skills and qualifications, but this information is largely unstructured and exists in many different formats. The objective of this research is to create a structured dataset of job vacancies in Sri Lanka using publicly available job advertisements. A total of 3500 images of job advertisements were scraped from Sri Lankan English newspapers and job websites and converted into text using Optical Character Recognition (OCR). A structured dataset was then created by extracting information from this text with a rule-based Natural Language Processing (NLP) approach, after which some basic insights into the labour market were derived. A dataset of this kind could be of considerable value to employers, job seekers and policymakers, as it provides up-to-date information on the skills and qualifications required in the job market. en_US
dc.language.iso en en_US
dc.publisher Business Research Unit (BRU)
dc.subject NLP en_US
dc.subject OCR en_US
dc.subject Information Extraction en_US
dc.subject Job advertisements en_US
dc.subject Labour market intelligence en_US
dc.title Information extraction from Sri Lankan job advertisements via rule-based approach en_US
dc.type Conference-Full-text en_US
dc.identifier.faculty Business en_US
dc.identifier.year 2021 en_US
dc.identifier.conference International Conference on Business Research en_US Moratuwa en_US
dc.identifier.pgnos pp. 143-152 en_US
dc.identifier.proceeding 4th International Conference on Business Research - ICBR 2021 en_US
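
The abstract describes a rule-based extraction step applied to OCR output but does not list the rules themselves. The sketch below is only an illustration of what such a step might look like, assuming the advertisement has already been converted to plain text. The patterns, the field names, the extract_fields function and the sample advertisement are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical rules, for illustration only; the paper's actual rules
# are not given in the abstract.
TITLE_RULE = re.compile(
    r"^\s*(?:position|vacancy|post)\s*[:\-]\s*(.+?)\s*$",
    re.IGNORECASE | re.MULTILINE,
)
QUALIFICATION_RULE = re.compile(r"\b(degree|diploma|certificate)\b", re.IGNORECASE)
EXPERIENCE_RULE = re.compile(
    r"(\d+)\+?\s*years?\s+(?:of\s+)?experience", re.IGNORECASE
)
EMAIL_RULE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_fields(ad_text: str) -> dict:
    """Turn the OCR text of one advertisement into a structured record."""
    title = TITLE_RULE.search(ad_text)
    years = EXPERIENCE_RULE.search(ad_text)
    # Collect distinct qualification keywords, normalised to lower case.
    qualifications = sorted(
        {m.group(0).lower() for m in QUALIFICATION_RULE.finditer(ad_text)}
    )
    return {
        "job_title": title.group(1) if title else None,
        "qualifications": qualifications,
        "experience_years": int(years.group(1)) if years else None,
        "contact_emails": EMAIL_RULE.findall(ad_text),
    }


# Made-up advertisement text standing in for OCR output.
sample_ad = """ABC Holdings (Pvt) Ltd
Position: Accounts Executive
Applicants should hold a Degree or Diploma in Accounting
with at least 2 years of experience.
Send your CV to careers@example.com
"""

record = extract_fields(sample_ad)
print(record)
```

Each rule fills one column of the structured dataset, and a field that no rule matches is left empty (None or an empty list) rather than guessed. Real OCR output is noisier than this sample, so a practical rule set would likely need spelling-tolerant patterns and layout cues.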
