Using Machine Learning Techniques for Prediction and Data Generation with Applications to Data Privacy

Abay, Nazmiye Ceren

dc.contributor.ORCID	0000-0002-7930-3455 (Abay, NC)
dc.contributor.advisor	Thuraisingham, Bhavani
dc.contributor.advisor	Kantarcioglu, Murat
dc.creator	Abay, Nazmiye Ceren
dc.date.accessioned	2020-12-10T19:55:09Z
dc.date.available	2020-12-10T19:55:09Z
dc.date.created	2019-12
dc.date.issued	2019-12
dc.date.submitted	December 2019
dc.date.updated	2020-12-10T19:55:10Z
dc.description.abstract	Machine learning (ML) applications are increasingly being developed and have become an integral part of many real-world systems. In particular, ML techniques are heavily used in research and industry to support effective decision-making. Despite the recent success of ML techniques, some domain-specific challenges remain that require in-depth investigation with respect to predictive accuracy, privacy protection, and cybersecurity. In this dissertation, we first examine the usability of ML techniques in the cryptocurrency transaction domain (e.g., Bitcoin), where there is no privacy concern because all Bitcoin transaction information is public, and show how to use ML techniques to make better predictions in real time. In application domains that involve sensitive data, collecting, sharing, and refining such data may raise serious privacy concerns. To address these concerns, we propose a privacy-preserving synthetic data generation technique that leverages deep learning. The proposed technique allows participants to share synthetic datasets freely without worrying about individual privacy. Furthermore, we compare the proposed technique with existing synthetic data generation algorithms and investigate the utility of these algorithms under different use cases. Finally, we explore the use of the generated synthetic data to improve the cybersecurity posture of organizations. We show that the generated synthetic data not only protects individual privacy but can also be used to deceive potential cyberattackers, because the synthetic data is indistinguishable from the real data. This, in turn, could reduce sensitive data leakage under successful cyberattacks, since an attacker could be deceived into targeting synthetic data instead of the real, sensitive data.
dc.description.sponsorship	NIH award 1R01HG006844, NSF awards CICI-1547324 and IIS-1633331
dc.format.mimetype	application/pdf
dc.identifier.uri	https://hdl.handle.net/10735.1/9091
dc.language.iso	en
dc.rights	©2019 Nazmiye Ceren Abay. All rights reserved.
dc.subject	Cryptocurrencies
dc.subject	Computer security
dc.subject	Machine learning
dc.subject	Artificial intelligence
dc.title	Using Machine Learning Techniques for Prediction and Data Generation with Applications to Data Privacy
dc.type	Dissertation
dc.type.material	text
thesis.degree.department	Computer Science
thesis.degree.grantor	The University of Texas at Dallas
thesis.degree.level	Doctoral
thesis.degree.name	PHD

Files

Original bundle

Name: ETD-5608-011D-262384.21.pdf
Size: 993.38 KB
Format: Adobe Portable Document Format
Description: Dissertation

License bundle

Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text
Description:

Name: LICENSE.txt
Size: 1.84 KB
Format: Plain Text
Description:

Collections

UTD Theses and Dissertations