Study in Big Data Harnessing and Related Problems

dc.contributor.advisor: Wu, Weili
dc.contributor.advisor: Ma, Dongsheng Brian
dc.contributor.committeeMember: Khan, Latifur
dc.contributor.committeeMember: Guo, Xiaohu
dc.contributor.committeeMember: Bastani, Farokh B.
dc.creator: Jin, Rong
dc.date.accessioned: 2023-08-22T18:51:08Z
dc.date.available: 2023-08-22T18:51:08Z
dc.date.created: 2021-08
dc.date.issued: 2021-08-01T05:00:00.000Z
dc.date.submitted: August 2021
dc.date.updated: 2023-08-22T18:51:09Z
dc.description.abstract: Social networks such as Facebook and Twitter have provided incredible opportunities for social communication between web users around the world. Social network analysis is an important problem in data harnessing. The analysis of social networks helps to summarize the interests and opinions of users (nodes), discover patterns in the interactions (edges) between users, and mine the events that take place on online platforms. The information obtained by analyzing social networks can be especially valuable for many applications. Typical examples include online advertisement targeting, viral marketing, personalized recommendation, health social media, social influence analysis, and citation network analysis. In this dissertation, we study two types of applications emerging from modern online social platforms from the viewpoint of social influence. One is the influence maximization (IM) problem in a discount-based online viral marketing scenario, which aims to maximize influence on the adoption of target products. The other is the online rumor source detection problem, in which the spread of misinformation is to be minimized and its source detected. We formulate both as set function optimization problems and design solutions with performance guarantees. In the study of set function optimization, a challenge arises from the submodularity of the objective function: some practical problems are neither submodular nor supermodular, so the existing greedy strategy cannot be applied directly to obtain a solution with a guaranteed approximation. To solve such non-submodular and non-supermodular problems, a method called DS decomposition has been considered, in which a given set function is represented as the difference of two submodular functions. Based on this method, we further study how to find a DS decomposition efficiently and effectively. We then propose a generalized framework, made up of our novel algorithms in deterministic and randomized versions, for maximizing DS decompositions, and show their performance under various combinatorial settings. In addition, we discuss our findings on the role of the black box, which has been an important component in computational complexity theory and has been used to establish the hardness of problems, and on its implied power and limitations in the study of data-driven computation for proving solutions to some computational problems.
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/10735.1/9775
dc.language.iso: en
dc.subject: Computer Science
dc.title: Study in Big Data Harnessing and Related Problems
dc.type: Thesis
dc.type.material: text
thesis.degree.college: School of Engineering and Computer Science
thesis.degree.department: Computer Science
thesis.degree.grantor: The University of Texas at Dallas
thesis.degree.name: PHD

Files

Original bundle

Name: JIN-PRIMARY-2022-1.pdf
Size: 2.27 MB
Format: Adobe Portable Document Format
