Neural Network Models for Text Understanding
Text understanding is a key component of Natural Language Processing (NLP) and Artificial Intelligence (AI). Traditional approaches combine features selected from sentences with relevant world knowledge to support further inference. However, such feature-based methods not only require a great deal of human effort, but also capture only the lexical and syntactic information of text, leaving its semantic meaning poorly represented. Hence, we present novel neural language models that represent text as semantic vectors and apply them to multiple high-level text understanding tasks. These models provide effective and general representations for learning the semantics of text, and they achieve state-of-the-art performance on a variety of text understanding tasks.

In this dissertation, we study several deep learning techniques, including multi-task learning, transfer learning, and multi-lingual learning, and apply them to a variety of text understanding tasks, such as Semantic Textual Similarity, Textual Entailment, and Semantic Relation Extraction.

First, we explore multi-task learning models for the Semantic Relatedness and Textual Entailment tasks. In Natural Language Processing, several tasks are highly related to one another. Instead of single-task learning, which optimizes each task-specific system independently, we can optimize all tasks simultaneously with one multi-task learning model. We select two related NLP tasks, Semantic Relatedness and Textual Entailment, and train them jointly with a variety of multi-task learning models. This study examines whether multi-task learning can outperform single-task learning on related NLP tasks.

Second, we study transfer learning models for the Semantic Relation Classification task.
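As a concrete illustration of the joint training used in the first study, a common multi-task design is hard parameter sharing: one shared encoder feeds two task-specific heads (a regression head for relatedness scores and a classification head for entailment labels), and the two losses are summed into one objective. The sketch below, in NumPy, shows only this structure; the dimensions, weights, and loss weighting are illustrative assumptions, not the dissertation's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy features for 4 sentence pairs (illustrative stand-in for encoder input)
x = rng.normal(size=(4, 8))

# Hard parameter sharing: one shared projection, two task-specific heads
W_shared = rng.normal(size=(8, 16)) * 0.1
W_rel = rng.normal(size=(16, 1)) * 0.1   # relatedness head: scalar score
W_ent = rng.normal(size=(16, 3)) * 0.1   # entailment head: 3 classes

h = np.tanh(x @ W_shared)                # shared representation for both tasks

rel_pred = (h @ W_rel).ravel()           # regression output per pair
ent_logits = h @ W_ent
ent_probs = np.exp(ent_logits) / np.exp(ent_logits).sum(axis=1, keepdims=True)

# Joint objective: weighted sum of the two task losses; training would
# backpropagate this single loss through the shared parameters
rel_true = np.array([1.0, 3.5, 4.2, 2.0])        # toy relatedness scores
ent_true = np.array([0, 2, 1, 0])                # toy entailment labels
mse = ((rel_pred - rel_true) ** 2).mean()
nll = -np.log(ent_probs[np.arange(4), ent_true]).mean()
joint_loss = mse + 0.5 * nll
```

Because the shared projection receives gradients from both losses, each task acts as a regularizer for the other, which is the intuition behind jointly optimizing related tasks.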
Domain adaptation is a common issue in Natural Language Processing, because creating corpora in a new domain requires substantial human annotation effort. In neural language modeling, researchers have made several attempts at universal sentence embedding methods, aiming for general-purpose sentence embeddings that can be adopted across a wide range of NLP tasks. We conduct an experiment to evaluate whether transfer learning can help train general-purpose sentence encoders for the Relation Classification task using a limited amount of training data.

Last, we study neural language models in multi-lingual settings. Specifically, we build models for the Semantic Relation Extraction task in Chinese and the Semantic Textual Similarity task in Arabic, English, and Spanish. We aim to show that, compared to traditional feature-based methods, neural language models can (1) achieve state-of-the-art performance with few or no manually designed features, and (2) learn general sentence representations regardless of language and domain.
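The transfer-learning setup in the second study can be sketched in its simplest form: a pretrained sentence encoder is kept frozen, and only a small task-specific head is trained on the limited labeled data. In the hypothetical NumPy sketch below, a fixed random projection stands in for the pretrained encoder, and a logistic-regression head is fit on a tiny labeled set; all data, sizes, and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Frozen "pretrained" encoder: a fixed random projection standing in for a
# universal sentence embedding model (hypothetical, for illustration only)
W_frozen = rng.normal(size=(20, 8))

def encode(feats):
    """Map raw sentence features to fixed embeddings; never updated."""
    return np.tanh(feats @ W_frozen)

# Limited labeled data for a toy 2-class relation classification task
X = rng.normal(size=(12, 20))
y = (X[:, 0] > 0).astype(int)

E = encode(X)                 # embeddings are computed once and stay fixed
w = np.zeros(8)
b = 0.0                       # only this small head is trained

for _ in range(300):          # gradient descent on the logistic loss
    p = 1.0 / (1.0 + np.exp(-(E @ w + b)))
    g = p - y
    w -= 0.5 * (E.T @ g) / len(y)
    b -= 0.5 * g.mean()

p = 1.0 / (1.0 + np.exp(-(encode(X) @ w + b)))
acc = float(((p > 0.5) == y).mean())
```

Because only the head's few parameters are learned, the approach can work with far less labeled data than training an encoder from scratch, which is the motivation for evaluating transfer learning under a limited training budget.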