Efficient Combination of Neural and Symbolic Learning for Relational Data
Abstract
Much has been achieved in AI, but to realize its true potential, an AI system must be able to learn generalizable and actionable higher-level knowledge from the lowest-level percepts. Inspired by this goal, neuro-symbolic systems have been developed over the past four decades. These systems combine the complementary strengths of neural networks, which adapt quickly to low-level input signals, and symbolic systems, which build deliberative, generalizable models. The advent of deep networks has accelerated the development of neuro-symbolic systems. While successful, these systems still face several open problems, a few of which we tackle in this dissertation: (i) several primitive neural network architectures have not been well studied in the symbolic context; (ii) there is a lack of generic neuro-symbolic architectures that do not make distributional assumptions; and (iii) the generalization abilities of many such systems are limited.

The objective of this dissertation is to develop novel neuro-symbolic models that (i) endow fundamental yet unexplored neural network architectures with symbolic reasoning capabilities, and (ii) provide solutions to the generalization issues that arise during neuro-symbolic integration. Specifically, we consider one of these primitive models, the Restricted Boltzmann Machine (RBM), originally employed for pre-training deep neural networks, and propose two solutions for lifting it to the relational setting. In the first, we employ relational random walks to generate relational features for the Boltzmann machine and train it by passing these features through a novel transformation layer. In the second, we employ functional gradient boosting to learn the structure and the parameters of the lifted RBM simultaneously.

Next, most neuro-symbolic models designed to date have focused on incorporating neural capabilities into specific models, leaving the field without a general relational neural network architecture. To close this gap, we develop a generic neuro-symbolic architecture that exploits relational parameter tying and combining rules to incorporate first-order logic rules into its hidden layers.

Finally, knowledge graph embedding models, among the most prevalent neuro-symbolic models, encode symbols as learnable vectors in Euclidean space and, in doing so, lose the ability to generalize to new symbols. We propose two solutions that circumvent this problem by exploiting the textual descriptions of entities in addition to the knowledge graph triples. Our first model trains on the text and the knowledge graph data jointly in a generative setting, while the second places the two data sources in an adversarial setting.

Our results across these several directions demonstrate the efficacy and efficiency of the proposed approaches on benchmark and novel data sets. In summary, this dissertation takes one of the first steps toward realizing the grand vision of neuro-symbolic integration by proposing novel models that allow for symbolic reasoning capabilities inside neural networks.
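As a concrete point of reference for the first contribution: a standard Restricted Boltzmann Machine is defined by the energy function below, and a lifted variant along the lines described would replace the raw visible units with features computed from relational random walks. The notation is ours and only gestures at the dissertation's transformation layer; using grounding counts of each walk as the feature value is a common choice, assumed here for illustration.

    % Standard RBM energy over visible units v and hidden units h
    E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i W_{ij} h_j

    % Relational lifting (illustrative): the i-th visible unit is a feature
    % derived from a relational random walk \pi_i, e.g. a grounding count
    v_i = \phi_{\pi_i}(x), \quad
    \phi_{\pi_i}(x) = |\{\text{groundings of } \pi_i \text{ satisfied in } x\}|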
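For the second contribution, the generic functional-gradient-boosting step is standard (Friedman-style boosting as used in relational learning), even though its application to lifted RBMs is the dissertation's own. A sketch of the machinery it builds on:

    % The potential \psi is grown stage-wise instead of being a fixed
    % parametric form
    \psi_m(x) = \psi_0(x) + \eta \sum_{k=1}^{m} \Delta_k(x)

    % Point-wise functional gradient of the log-likelihood for a sigmoid
    % potential: the residual each stage is fit to
    \Delta(x_i) = I(y_i = 1) - P(y_i = 1 \mid x_i)

Because each \Delta_k is typically fit by a relational regression tree, every boosting iteration adds both structure (the tree's clauses) and parameters (its leaf values), which is what makes simultaneous structure and parameter learning possible.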
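The two key ingredients of the third contribution, relational parameter tying and combining rules, can be illustrated with a toy hidden unit: all groundings of one first-order rule share a single weight vector, and a combining rule collapses their variable number of activations into one value. This is a minimal sketch under our own assumptions (sigmoid activations, mean and noisy-or combiners), not the proposed architecture itself.

    import numpy as np

    def rule_hidden_unit(grounding_features, w, b, combine="mean"):
        # grounding_features: (n_groundings, d), one row per grounding of
        # the rule for the current example.
        # w, b: ONE weight vector and bias shared by every grounding of
        # the rule -- this sharing is the relational parameter tying.
        scores = grounding_features @ w + b        # (n_groundings,)
        acts = 1.0 / (1.0 + np.exp(-scores))       # per-grounding sigmoid

        # Combining rule: aggregate however many groundings exist into a
        # single hidden-unit activation.
        if combine == "mean":
            return acts.mean()
        if combine == "noisy_or":
            return 1.0 - np.prod(1.0 - acts)
        raise ValueError(f"unknown combining rule: {combine}")

    # Toy usage: a rule with 3 groundings and 4 features per grounding.
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(3, 4))
    w, b = rng.normal(size=4), 0.1
    print(rule_hidden_unit(feats, w, b, combine="noisy_or"))

Because the weights are attached to the rule rather than to its groundings, the unit handles any number of groundings per example and the parameter count stays independent of domain size.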
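Finally, the generalization gap in knowledge graph embeddings, and the text-based remedy, can be seen in a toy example: a lookup-table model simply has no vector for an entity outside its training vocabulary, whereas an encoder over the entity's description can always produce one. The TransE-style scorer and the hash-based encode_text below are illustrative stand-ins chosen by us, not the dissertation's generative or adversarial models.

    import zlib
    import numpy as np

    DIM = 8
    rng = np.random.default_rng(0)

    # Learned lookup tables (toy values): a pure embedding model can only
    # score triples whose symbols appear here.
    entity_emb = {"Paris": rng.normal(size=DIM), "France": rng.normal(size=DIM)}
    relation_emb = {"capital_of": rng.normal(size=DIM)}

    def transe_score(h, r, t):
        # TransE: plausible triples satisfy h + r ~ t (higher = better).
        return -float(np.linalg.norm(h + r - t))

    def encode_text(description):
        # Stand-in for a learned text encoder: a stable hash-based
        # bag-of-words vector, purely for illustration.
        vecs = [np.random.default_rng(zlib.crc32(w.encode())).normal(size=DIM)
                for w in description.lower().split()]
        return np.mean(vecs, axis=0)

    def embed_entity(name, description):
        # Seen entity: trained vector; unseen entity: fall back to text.
        return entity_emb.get(name, encode_text(description))

    h = embed_entity("Lyon", "a city in southeastern France")   # unseen
    t = embed_entity("France", "a country in western Europe")   # seen
    print(transe_score(h, relation_emb["capital_of"], t))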