Automatic code generation from graphical user interface (GUI) images
Date
2024
Authors
Sabthavi, J.
Abstract
In today's software landscape, graphical user interfaces (GUIs) are crucial to the user experience, and as demand for user-friendly applications grows, GUI-based software development becomes increasingly complex. This rapid growth demands efficient methods for translating design representations into functional code. This thesis explores an image-captioning approach to automatically generating source code from GUI images. It uses the openly available Pix2Code dataset, which comprises screenshots of GUIs and their corresponding DSL code for Android, iOS, and web platforms. The proposed model follows a standard encoder-decoder architecture, integrating a ResNet152 image encoder with an LSTM text decoder to convert GUI images into human-readable DSL code sequences. ResNet152 is a strategic choice because of its exceptional depth: GUI images contain design elements such as buttons, text fields, and menus that require a nuanced understanding to translate accurately into code, and the depth of ResNet152 enables it to capture these intricate visual details, contributing to the precision and accuracy of the generated code. The thesis also investigates the choice of decoding algorithm, opting for a greedy approach over alternatives such as beam search; this decision is informed by the need for simplicity and efficiency in the code generation process. The model's performance is assessed using BLEU scores, achieving a noteworthy score of 0.78.
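The greedy decoding strategy chosen over beam search can be illustrated with a minimal, self-contained sketch. The `next_token_probs` function below is a hypothetical stand-in for the LSTM decoder's per-step output distribution; in the actual model each step would also be conditioned on the ResNet152 image features, which are omitted here for clarity.

```python
# Minimal sketch of greedy decoding for a DSL token sequence.
# `next_token_probs` is a hypothetical stand-in for the trained LSTM
# decoder; a real model would condition on ResNet152 image features.

def greedy_decode(next_token_probs, start_token="<START>",
                  end_token="<END>", max_len=50):
    """At each step, emit the single highest-probability token (no beam)."""
    tokens = [start_token]
    for _ in range(max_len):
        probs = next_token_probs(tokens)   # dict: token -> probability
        best = max(probs, key=probs.get)   # greedy choice: argmax only
        if best == end_token:
            break
        tokens.append(best)
    return tokens[1:]  # drop the start token

# Toy distribution (illustrative only): emits one button token, then ends.
def toy_probs(tokens):
    if tokens[-1] == "<START>":
        return {"btn-green": 0.9, "<END>": 0.1}
    return {"<END>": 0.95, "btn-green": 0.05}

print(greedy_decode(toy_probs))  # → ['btn-green']
```

Unlike beam search, which tracks the k best partial sequences at every step, this keeps only one hypothesis, trading some output quality for speed and simplicity.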
This thesis provides an in-depth exploration of the technical aspects, including the model architecture and training process, along with insights into the challenges encountered. It contributes to the field of automated code generation by introducing an efficient method for translating GUI images into executable code, with implications for streamlining software development, reducing manual coding effort, and enhancing overall productivity. The experimental results showcase the model's proficiency in capturing complex GUI designs and generating accurate source code snippets, paving the way for potential applications in GUI-based software development methodologies and suggesting possibilities for future enhancements in code generation across platforms.
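The BLEU metric used to evaluate the generated DSL sequences can be sketched in plain Python. This is a simplified single-sentence version (geometric mean of modified n-gram precisions with a brevity penalty), not the exact evaluation code from the thesis; the token strings are made-up examples of DSL vocabulary.

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of modified
    n-gram precisions, scaled by a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(tuple(candidate[i:i + n])
                       for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i + n])
                      for i in range(len(reference) - n + 1))
        # Clip each n-gram's count by its count in the reference.
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        precisions.append(overlap / max(sum(cand.values()), 1))
    if min(precisions) == 0:
        return 0.0  # any empty precision zeroes the geometric mean
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(candidate) >= len(reference) else \
        math.exp(1 - len(reference) / max(len(candidate), 1))
    return bp * geo_mean

# Illustrative DSL-style token sequences (hypothetical vocabulary).
ref = "header btn-active btn-inactive row single text".split()
print(bleu(ref, ref))  # identical sequences score 1.0
```

A score of 1.0 means the generated token sequence exactly matches the reference; the thesis's reported 0.78 indicates substantial but imperfect n-gram overlap with the ground-truth DSL code.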
Keywords
COMPUTER GRAPHICS-Graphical User Interfaces, CONVOLUTIONAL NEURAL NETWORKS, DEEP LEARNING-Long Short-Term Memory, COMPUTER PROGRAMMING LANGUAGES-Domain Specific Languages, NATURAL LANGUAGE PROCESSING-Bilingual Evaluation Understudy, COMPUTER SCIENCE AND ENGINEERING-Dissertation, MSc in Computer Science
Citation
Sabthavi, J. (2024). Automatic code generation from graphical user interface (GUI) images [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. https://dl.lib.uom.lk/handle/123/23755
