Institutional-Repository, University of Moratuwa.  

Information extraction from Sri Lankan job advertisements via rule-based approach


dc.contributor.author Bandara, RMHD
dc.contributor.author Gunasekara, HASS
dc.contributor.author Peiris, WADS
dc.contributor.author Wijekoon, WMHC
dc.contributor.author De Silva, TS
dc.contributor.author Hewawalpita, SGS
dc.contributor.author Rathnayake, HMSC
dc.date.accessioned 2021-12-07T06:24:30Z
dc.date.available 2021-12-07T06:24:30Z
dc.date.issued 2021-12-03
dc.identifier.uri http://dl.lib.uom.lk/handle/123/16859
dc.description.abstract One of the major problems in the Sri Lankan labour market is the lack of demand-side information, which has created a gap between the supply of and demand for labour. Job advertisements provide a wide range of real-time information about aspects that are in demand, such as skills and qualifications, though this information is largely unstructured and exists in many different formats. The objective of this research is to create a structured dataset of job vacancies in Sri Lanka using publicly available job advertisements. A total of 3500 images of job advertisements were scraped from Sri Lankan English newspapers and job websites and converted into text using Optical Character Recognition (OCR). Next, a structured dataset was created by extracting information with a rule-based Natural Language Processing (NLP) approach, after which some basic insights on the labour market were derived. A dataset of this kind could provide substantial value to employers, job seekers and policymakers by offering up-to-date information on the skills and qualifications required in the job market. en_US
dc.language.iso en en_US
dc.publisher Business Research Unit (BRU)
dc.subject NLP en_US
dc.subject OCR en_US
dc.subject Information Extraction en_US
dc.subject Job advertisements en_US
dc.subject Labour market intelligence en_US
dc.title Information extraction from Sri Lankan job advertisements via rule-based approach en_US
dc.type Conference-Full-text en_US
dc.identifier.faculty Business en_US
dc.identifier.year 2021 en_US
dc.identifier.conference International Conference on Business Research en_US
dc.identifier.place Moratuwa en_US
dc.identifier.pgnos pp. 143-152 en_US
dc.identifier.proceeding 4th International Conference on Business Research - ICBR 2021 en_US
dc.identifier.email harini.17@business.mrt.ac.lk en_US
dc.identifier.email suwani.17@business.mrt.ac.lk en_US
dc.identifier.email diluni.17@business.mrt.ac.lk en_US
dc.identifier.email himali.17@business.mrt.ac.lk en_US
dc.identifier.email tilokad@uom.lk en_US
dc.identifier.email supungs@uom.lk en_US
dc.identifier.email samadhic@uom.lk en_US
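
The abstract describes turning OCR text from job advertisements into structured records with a rule-based approach. The record does not give the actual rules, so the following is only a minimal sketch of the general idea using regular expressions. The field names, the patterns, the function extract_fields and the sample advertisement are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical rules for illustration only; these are NOT the rules used in the paper.
RULES = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "experience_years": re.compile(r"(\d+)\+?\s*years?", re.IGNORECASE),
    "qualification": re.compile(r"\b(degree|diploma|MBA)\b", re.IGNORECASE),
}


def extract_fields(ad_text):
    """Apply each rule to the OCR text and collect every match per field."""
    record = {}
    for field, pattern in RULES.items():
        record[field] = [match.strip() for match in pattern.findall(ad_text)]
    return record


# Made-up advertisement text standing in for OCR output.
sample_ad = (
    "ACCOUNTS EXECUTIVE\n"
    "Degree or Diploma in Accounting with 3 years experience.\n"
    "Apply to careers@example.com"
)

if __name__ == "__main__":
    print(extract_fields(sample_ad))
```

Each advertisement becomes one dictionary keyed by field, so a list of such dictionaries can be written out as rows of a structured dataset. Real OCR output is usually noisier than this sample, so practical rules would likely need to tolerate misread characters and varied layouts.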

