An Answer Set Programming Based Approach to Representing and Querying Textual Knowledge
Pendharkar, Dhruva R.
Knowledge Representation and Reasoning (KR&R) is a field of Artificial Intelligence that deals with converting information into knowledge in a form that a computer can process. It applies concepts from psychology about how humans make rational decisions to build formal rules that model human cognitive processes. Using the resulting knowledge bases, a computer can then solve complex tasks such as question answering, summarization, automated reasoning, and medical diagnosis. Many of these tasks require an understanding of natural language text. A vast amount of the knowledge we have today comes from books and is in the form of natural language text. Such knowledge is unstructured and is not easily interpretable by computers. This thesis proposes an approach based on answer set programming (ASP) for representing knowledge generated from natural language text. This knowledge is then used to perform reasoning with the help of advanced implementations of ASP such as s(ASP). ASP representations of techniques such as default reasoning, hierarchical knowledge organization, and negation as failure are used to model the common-sense reasoning methods required to accomplish this task. Automated question answering is used in this thesis to demonstrate the effectiveness of our ASP-based KR&R techniques. The automated Q&A system developed as part of this thesis parses natural language text and converts it to an ASP knowledge base. Users can pose questions in natural language, which are parsed and converted into ASP queries automatically. These queries are then solved against the knowledge base obtained from the natural language text, augmented with related auxiliary knowledge from other resources such as WordNet. In contrast to approaches based on machine learning, our system answers questions by actually understanding the text.
This approach has been tested on the SQuAD dataset and the results are promising.
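To make the reasoning techniques named in the abstract more concrete, the following is a minimal Python sketch of default reasoning with negation as failure over a small class hierarchy. It is an illustration only and not the thesis's implementation, which uses ASP and s(ASP). All names here (`IS_A`, `INSTANCE_OF`, `is_a`, `abnormal_flyer`, `flies`, `answer`) and the toy facts are hypothetical. In ASP, the default would typically be written along the lines of `flies(X) :- bird(X), not ab(X).`

```python
# Toy illustration (hypothetical names and facts, not taken from the thesis).

# Hierarchical knowledge organization: each class maps to its parent class.
IS_A = {"penguin": "bird", "sparrow": "bird", "bird": "animal"}

# Facts that might be extracted from text: each entity and its most specific class.
INSTANCE_OF = {"tweety": "sparrow", "pingu": "penguin"}


def is_a(entity, cls):
    """Return True if the entity belongs to cls, directly or through the hierarchy."""
    current = INSTANCE_OF.get(entity)
    while current is not None:
        if current == cls:
            return True
        current = IS_A.get(current)
    return False


def abnormal_flyer(entity):
    """Exception to the default: penguins are birds that do not fly."""
    return is_a(entity, "penguin")


def flies(entity):
    """Default rule: a bird flies unless it can be shown to be abnormal.

    The 'not' plays the role of negation as failure: abnormality is
    assumed false whenever it cannot be derived from the knowledge base.
    """
    return is_a(entity, "bird") and not abnormal_flyer(entity)


def answer(entity):
    """Answer the yes/no question 'Does <entity> fly?' against the knowledge base."""
    return "yes" if flies(entity) else "no"
```

Under these assumptions, `answer("tweety")` yields "yes" because nothing marks a sparrow as abnormal, while `answer("pingu")` yields "no" because the penguin exception overrides the default. An entity absent from the knowledge base also yields "no", since nothing about it can be derived.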