Text-to-SQL generation using schema item classifier and encoder-decoder architecture

dc.contributor.advisor: Uthayasanker, T
dc.contributor.author: Rushdy, MSA
dc.date.accept: 2023
dc.date.accessioned: 2025-02-03T08:44:53Z
dc.date.available: 2025-02-03T08:44:53Z
dc.date.issued: 2023
dc.description.abstract: The objective of the text-to-SQL task is to convert natural language queries into SQL queries. However, the presence of extensive text-to-SQL datasets across multiple domains, such as Spider, introduces the challenge of effectively generalizing to unseen data. Existing semantic parsing models have struggled to achieve notable performance improvements on these cross-domain datasets. As a result, recent work has focused on leveraging pre-trained language models to address this issue and improve performance on text-to-SQL tasks. These approaches represent the latest and most promising attempts to tackle generalization and performance improvement in this field. I propose an approach to evaluate and use a Seq2Seq model in which the most relevant schema items are given as input to the encoder, and the decoder generates accurate and valid cross-domain SQL queries by understanding the skeleton of the target SQL query. The proposed approach is evaluated on Spider, a well-known dataset for the text-to-SQL task, and obtains promising results: Exact Match accuracy and Execution accuracy are boosted to 72.7% and 80.2%, respectively, compared to the other best related approaches. Keywords: Text-to-SQL, Seq2Seq model, BERT, RoBERTa, T5-Base [en_US]
dc.identifier.accno: TH5304 [en_US]
dc.identifier.citation: Rushdy, M.S.A. (2023). Text-to-SQL generation using schema item classifier and encoder-decoder architecture [Master's theses, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.uom.lk/handle/123/23424
dc.identifier.degree: MSc in Computer Science [en_US]
dc.identifier.department: Department of Computer Science & Engineering [en_US]
dc.identifier.faculty: Engineering [en_US]
dc.identifier.uri: http://dl.lib.uom.lk/handle/123/23424
dc.language.iso: en [en_US]
dc.subject: TEXT-TO-SQL
dc.subject: SEQ2SEQ MODEL
dc.subject: BERT
dc.subject: T5-BASE
dc.subject: ROBERTA
dc.subject: COMPUTER SCIENCE & ENGINEERING – Dissertation
dc.subject: COMPUTER SCIENCE- Dissertation
dc.subject: MSc in Computer Science
dc.title: Text-to-SQL generation using schema item classifier and encoder-decoder architecture [en_US]
dc.type: Thesis-Abstract [en_US]

Files

Original bundle

Name: TH5304-1.pdf
Size: 134.35 KB
Format: Adobe Portable Document Format
Description: Pre-text

Name: TH5304-2.pdf
Size: 88.14 KB
Format: Adobe Portable Document Format
Description: Post-text

Name: TH5304.pdf
Size: 1.11 MB
Format: Adobe Portable Document Format
Description: Full-thesis