A Novel aspect taxonomy and aspect extraction methodology for scholarly book reviews

dc.contributor.advisorRanathunga S
dc.contributor.authorBasuru WWACS
dc.date.accept2019
dc.date.accessioned2019
dc.date.available2019
dc.date.issued2019
dc.description.abstractMany people judge the quality of a product by its online reviews, and this is also the most common approach when purchasing books from online bookstores. Compared to other products, a scholarly book is one of the most difficult to purchase online, since customers have limited access to its internal content. A customer therefore has to go through multiple reviews to gain insight into the book. However, the sheer volume of online reviews makes it difficult for a human to process them and extract all the meaningful information needed to make an educated purchase. As a result, a sentiment analysis system for scholarly book reviews is much needed. A more accurate opinion of the book can be obtained through aspect-based summarization. This type of opinion summarization is critical for scholarly book reviews, since content, organization, and other features determine whether the book can be recommended to a customer at a certain education level. Compared to sentiment analysis on reviews of products and services such as movies or restaurants, there is no well-established research on aspect extraction or aspect-based sentiment analysis of scholarly book reviews. Not surprisingly, there is no well-defined aspect taxonomy for this domain, nor an annotated dataset from which to extract aspects or identify aspect categories. Compared to other domains, identifying aspects of book reviews is difficult, since aspects such as the quality of the book or the discussed topics always appear implicitly in reviews. The main contribution of this research is the identification of potential aspects and an aspect taxonomy for scholarly book reviews. We also present (1) a dependency rule-based unsupervised model for aspect extraction, which works better than state-of-the-art unsupervised methods, and (2) a clustering-based aspect category identification method.
Both of these are important first steps for aspect-based sentiment analysis. The aspect taxonomy for scholarly book reviews is a hierarchical model. Book and author have been identified as the first level of the taxonomy. Readability, content, worthiness, and price form the next level of the taxonomy under the book aspect category. Author expertise has been identified as an aspect category under author. To validate the aspect taxonomy, an unsupervised aspect extraction and clustering algorithm is proposed. An existing dependency rule-based aspect extraction algorithm is improved by adding new rules that extract aspects from book reviews. Two existing aspect clustering algorithms are merged to obtain a new clustering algorithm that discovers the categories of aspect terms. The clustering algorithm measures the semantic similarity of aspect terms while also considering the words shared between aspect terms, and groups similar aspects into one cluster. An annotated corpus of scholarly book reviews in the computer science domain was created, with a Cohen’s kappa of 0.76. On this corpus, the dependency rule-based aspect extractor extracted both implicit and explicit aspects with a precision of 76.04%, a recall of 75.99%, and an overall F1-score of 76.02%. The proposed semantic similarity-based aspect clustering algorithm assigns aspects to the following categories: book, author, readability, content, worthiness, price, and author expertise, with a Rand index of 14.41%, V-measure of 36.29%, homogeneity of 66.18%, and completeness of 25%.en_US
dc.identifier.accnoTH4097en_US
dc.identifier.citationBasuru, W.W.A.C.S. (2019). A Novel aspect taxonomy and aspect extraction methodology for scholarly book reviews [Master’s thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.mrt.ac.lk/handle/123/16180
dc.identifier.degreeMSc in Computer Science and Engineeringen_US
dc.identifier.departmentDepartment of Computer Science & Engineeringen_US
dc.identifier.facultyEngineeringen_US
dc.identifier.urihttp://dl.lib.mrt.ac.lk/handle/123/16180
dc.language.isoenen_US
dc.subjectCOMPUTER SCIENCE AND ENGINEERING-Dissertationsen_US
dc.subjectCOMPUTER SCIENCE-Dissertationsen_US
dc.subjectSENTIMENT ANALYSISen_US
dc.subjectASPECT TAXONOMYen_US
dc.subjectCLUSTERING ALGORITHMSen_US
dc.subjectSCHOLARLY PUBLISHING-Book Reviewsen_US
dc.subjectBOOK REVIEWSen_US
dc.titleA Novel aspect taxonomy and aspect extraction methodology for scholarly book reviewsen_US
dc.typeThesis-Full-texten_US
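
The two-level taxonomy described in the abstract can be written down as a small data structure. The following is only an illustrative sketch: the category names are taken from the abstract, while the dictionary layout and the helper names `parent_of` and `all_categories` are assumptions made here, not part of the thesis.

```python
# Hypothetical encoding of the hierarchical aspect taxonomy.
# Category names come from the abstract; the structure and helpers are assumptions.
ASPECT_TAXONOMY = {
    "book": ["readability", "content", "worthiness", "price"],
    "author": ["author expertise"],
}


def parent_of(category):
    """Return the first-level category that owns `category`.

    Returns None for a first-level category or an unknown name.
    """
    for parent, children in ASPECT_TAXONOMY.items():
        if category in children:
            return parent
    return None


def all_categories():
    """List every category, each first-level entry followed by its children."""
    names = []
    for parent, children in ASPECT_TAXONOMY.items():
        names.append(parent)
        names.extend(children)
    return names


if __name__ == "__main__":
    print(all_categories())
    print(parent_of("price"))
```

Flattening the hierarchy with `all_categories()` gives the seven categories that the abstract lists as clustering targets.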

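The scores quoted in the abstract can be sanity-checked with a short calculation. This sketch assumes the standard definitions, in which both the F1-score and the V-measure (with equal weighting) are harmonic means of their two component scores. The abstract does not state how the thesis averaged its scores, so this is a consistency check rather than a reproduction of the evaluation.

```python
def harmonic_mean(a, b):
    """Harmonic mean of two non-negative scores; 0.0 when both are zero."""
    total = a + b
    return 2 * a * b / total if total else 0.0


# Percentages as reported in the abstract.
precision, recall = 76.04, 75.99
homogeneity, completeness = 66.18, 25.0

# F1 is the harmonic mean of precision and recall.
f1 = harmonic_mean(precision, recall)
# V-measure (equal weighting assumed) is the harmonic mean of homogeneity and completeness.
v_measure = harmonic_mean(homogeneity, completeness)

if __name__ == "__main__":
    print(f"F1 from rounded inputs: {f1:.3f}")
    print(f"V-measure from rounded inputs: {v_measure:.3f}")
```

Recomputed from the rounded inputs, the V-measure matches the reported 36.29%, and the F1-score agrees with the reported 76.02% to within about 0.01 percentage points.
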
Files

Original bundle

Name:
TH4097-1.pdf
Size:
185.37 KB
Format:
Adobe Portable Document Format
Description:
Pre-text
Name:
TH4097-2.pdf
Size:
209.51 KB
Format:
Adobe Portable Document Format
Description:
Post-text
Name:
TH4097.pdf
Size:
1.09 MB
Format:
Adobe Portable Document Format
Description:
Full-thesis