Bayesian Nonparametric Probabilistic Methods in Machine Learning
Abstract
Abstract
Many aspects of modern science, business and engineering have become data-centric, relying
on tools from Artificial Intelligence and Machine Learning. Practitioners and researchers in
these fields need tools that can incorporate observed data into rich models of uncertainty to
make discoveries and predictions. One area of study that provides such models is the field of
Bayesian Nonparametrics. This dissertation is focused on furthering the development of this
field.
After reviewing the relevant background and surveying the field, we consider two areas of
structured data:
- We first consider relational data that takes the form of a 2-dimensional array—such as
social network data. We introduce a novel nonparametric model that takes advantage
of a representation theorem about arrays whose column and row order is unimportant.
We then develop an inference algorithm for this model and evaluate it experimentally.
- Second, we consider the classification of streaming data whose distribution evolves over
time. We introduce a novel nonparametric model that finds and exploits a dynamic
hierarchical structure underlying the data. We present an algorithm for inference in
this model and show experimental results. We then extend our streaming model to
handle the emergence of novel and recurrent classes, and evaluate the extended model
experimentally.