Deep Learning Methods for Improving Event Extraction on Political and Social Science Studies
Political and social science scholars increasingly rely on event coders, automated systems that extract structured event representations from news articles, to monitor, analyze, and predict conflicts and affairs involving political entities across the globe. However, existing event coders rest on outdated pattern-matching techniques that depend on large, manually maintained dictionaries of lexico-syntactic patterns designed to capture conflict events. Updating and expanding such dictionaries is costly, slow, and requires specialized knowledge, and these techniques do not support event extraction from multilingual corpora. As a consequence, existing systems often yield low-recall results and impose limitations when working with sources from different countries and languages. In this dissertation, we propose deep-learning-based frameworks to obtain state-of-the-art results for extracting structured events from natural language text in the political and social science domains. We do so by exploring three main directions: (i) automatically extending the external dictionaries and knowledge bases used by current event coders through knowledge extraction techniques; (ii) formulating the event coding task as a classification problem and proposing a supervised deep learning model to solve it; and (iii) developing an innovative deep neural network design that combines state-of-the-art language representation models with multi-task learning to efficiently extract structured events from multilingual corpora. We demonstrate the superiority of our approaches through extensive experiments on real-world multilingual corpora from the political science and conflict domains.