Inference Guided Learning of Probabilistic Models
Probabilistic graph-based models such as Bayesian and Markov networks are used to represent and reason about uncertainty in many real-world domains. Since most inference or reasoning tasks over them are NP-hard in general, two strategies are used in practice to combat the intractability of exact inference. First, a potentially intractable model is learned from data and polynomial-time approximate inference algorithms (e.g., belief propagation, Gibbs sampling, etc.) are then used at inference time to trade accuracy for computational efficiency. Second, strong constraints (e.g., a tree structure) are imposed on the model at learning time so that exact inference is tractable, and exact algorithms are then used at inference time. Inspired by work in the tractable probabilistic models community, which uses the latter approach, in this dissertation we propose algorithms that learn models on which exact or approximate inference (or both) is computationally efficient as well as accurate. We call such algorithms inference guided learning (IGL) algorithms. To date, we have developed four novel IGL algorithms, investigated their theoretical properties, and empirically evaluated their practical performance.

Our first algorithm induces a cutset network, a probabilistic model that admits linear-time full maximum-a-posteriori (MAP) inference in addition to linear-time posterior marginal (MAR) inference. This algorithm alleviates a shortcoming of existing work on tractable probabilistic models: in the learned models, exact MAR inference is tractable but full MAP inference is not, and as a result they perform poorly on the latter query type. Our second algorithm learns more accurate discriminative, or conditional, cutset networks (CCNs) from data. These networks yield more accurate answers to full MAP and MAR queries under the assumption that a fixed subset of the variables in the application domain is always observed.
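To make the linear-time inference property concrete, the following is a minimal sketch (our own illustration, not the dissertation's algorithm or implementation): internal OR nodes condition on one binary cutset variable, leaves are fully factorized distributions, and both a full-assignment probability and a full MAP query are answered in a single pass over the structure. All class names, variables, and parameter values are hypothetical.

```python
# Minimal cutset-network sketch (hypothetical structure and numbers).
# Internal OR nodes condition on one binary variable; leaves are
# fully factorized (independent-variable) distributions.

class Leaf:
    def __init__(self, probs):
        self.probs = probs                 # probs[v] = P(X_v = 1)

    def prob(self, assignment):            # P(assignment) under independence
        p = 1.0
        for v, theta in self.probs.items():
            p *= theta if assignment[v] == 1 else 1.0 - theta
        return p

    def map(self):                         # most likely completion and its prob
        assign = {v: int(t >= 0.5) for v, t in self.probs.items()}
        p = 1.0
        for t in self.probs.values():
            p *= max(t, 1.0 - t)
        return p, assign

class OrNode:
    def __init__(self, var, w1, left, right):
        # w1 = P(var = 1); left/right are the var = 0 / var = 1 branches
        self.var, self.w1, self.left, self.right = var, w1, left, right

    def prob(self, assignment):            # follow one branch: linear time
        if assignment[self.var] == 1:
            return self.w1 * self.right.prob(assignment)
        return (1.0 - self.w1) * self.left.prob(assignment)

    def map(self):                         # maximize over the branch choice
        p0, a0 = self.left.map()
        p1, a1 = self.right.map()
        if self.w1 * p1 >= (1.0 - self.w1) * p0:
            return self.w1 * p1, {self.var: 1, **a1}
        return (1.0 - self.w1) * p0, {self.var: 0, **a0}

# Tiny network over A (cutset variable) and B, C (leaf variables).
net = OrNode('A', 0.3,
             left=Leaf({'B': 0.2, 'C': 0.9}),
             right=Leaf({'B': 0.7, 'C': 0.4}))

p = net.prob({'A': 1, 'B': 1, 'C': 0})     # 0.3 * 0.7 * 0.6 = 0.126
map_p, map_assign = net.map()              # 0.504 at A=0, B=0, C=1
```

Both queries visit each node at most once, which is the sense in which MAR and full MAP inference are linear time in the size of the network.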
Our third algorithm induces probabilistic models on which Rao-Blackwellised importance sampling, a popular simulation-based inference scheme, is likely to perform well. Finally, our fourth algorithm focuses on using local information to improve the quality of tractable models. Unlike global information, local information is plentiful but susceptible to noise, and our proposed method therefore filters this noise in a principled manner.
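For readers unfamiliar with Rao-Blackwellised importance sampling, the following toy sketch (our own illustration; the model, numbers, and function names are hypothetical) shows the core idea: only a subset of the variables is sampled from a proposal, while the remaining variables are summed out exactly, which reduces the variance of the estimator relative to sampling everything.

```python
import random

# Unnormalized model f(a, b) over two binary variables (hypothetical
# numbers). We estimate Z = sum_{a,b} f(a, b) by sampling only A from
# a proposal q(A) and summing over B exactly ("collapsing" B).
f = {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 2.0, (1, 1): 4.0}
q = {0: 0.5, 1: 0.5}                       # proposal over the sampled var A

def exact_sum_over_b(a):                   # analytic (Rao-Blackwellised) part
    return f[(a, 0)] + f[(a, 1)]

def rb_estimate(n_samples, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        a = 1 if rng.random() < q[1] else 0
        total += exact_sum_over_b(a) / q[a]    # importance weight 1/q(a)
    return total / n_samples

z_hat = rb_estimate(10_000)                # true Z = 1 + 3 + 2 + 4 = 10
```

Each sample contributes an exactly computed partial sum rather than a single point evaluation, so the per-sample values here range only over {8, 12} instead of {2, 4, 6, 8}; a model on which such exact partial computations are cheap is exactly the kind of model the third algorithm aims to learn.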