Privacy preserving data publishing framework for unstructured textual social media data

Abeywardana PBPA

dc.contributor.advisor	Uthyasanker T
dc.contributor.author	Abeywardana PBPA
dc.date.accessioned	2020
dc.date.available	2020
dc.date.issued	2020
dc.identifier.uri	http://dl.lib.uom.lk/handle/123/16486
dc.description.abstract	Privacy has become an essential part of data science and analytics due to the potential for misuse of personal data. As a result of privacy breaches reported in various analytical studies, privacy preservation has become a legal responsibility rather than a simple social responsibility. Preserving the privacy of unstructured data is more challenging than preserving that of structured data. Social media has become hugely popular over the past couple of decades, and it pumps a huge amount of data at high velocity into analytical systems. Social media profiles contain a wealth of personal and sensitive information, creating enormous opportunities for third parties to analyze them with different algorithms, draw conclusions, and use them in disinformation campaigns and micro-targeting-based dark advertising. The primary goal of this study is to provide a mitigation mechanism for privacy breaches that occur through disinformation campaigns based on insights extracted from the analysis of personal or sensitive data. Specifically, this research aims to build a privacy preserving data publishing framework for unstructured textual social media data without compromising the true analytical value of those data. A novel way is proposed to apply traditional structured privacy preserving techniques to unstructured data. Creating a comprehensive Twitter corpus annotated with privacy attributes is another objective of this research, especially because the research community lacks one. An easily extensible framework that can be adopted by many domains is implemented here, integrating different concepts from the literature. A comprehensive set of experiments is also performed to assess the capabilities of the machine learning models and algorithms, as well as to simulate some real-world privacy preserving data publishing use cases.	en_US
dc.language.iso	en	en_US
dc.subject	COMPUTER SCIENCE - Dissertation	en_US
dc.subject	COMPUTER SCIENCE & ENGINEERING - Dissertation	en_US
dc.subject	SOCIAL MEDIA - Twitter	en_US
dc.subject	UNSTRUCTURED TEXTUAL DATA	en_US
dc.subject	DATA - Privacy	en_US
dc.subject	TWITTER	en_US
dc.title	Privacy preserving data publishing framework for unstructured textual social media data	en_US
dc.type	Thesis-Full-text	en_US
dc.identifier.faculty	Engineering	en_US
dc.identifier.degree	MSc in Computer Science and Engineering	en_US
dc.identifier.department	Department of Computer Science and Engineering	en_US
dc.date.accept	2020
dc.identifier.accno	TH4287	en_US