Detection of Claims and Supporting Evidence in Wikipedia Articles on Controversial Topics
Abstract
This thesis presents the task of argument mining, that is, finding arguments within natural language texts, and reports on experiments that combine techniques previously applied in disparate but related domains. The experiments address three tasks: detecting claims, detecting evidence, and predicting whether a piece of evidence supports a claim. They use a large corpus built and labeled at IBM Research, which has been made freely available to other researchers.
This thesis demonstrates the usefulness of that resource for argument mining experiments by combining techniques previously tried on data from a different domain with insights from discourse processing and machine learning. Features from discourse processing applications and a kernel method from machine learning, both of which were expected to perform well on the argument mining tasks, were tested and compared. In a first published application, the subset tree kernel used with a support vector machine model was found to perform well for all three tasks. Other researchers had previously detected claims using a similar tree kernel. Augmenting the subset tree kernel with feature vectors, as previous researchers had tried for claims, improved performance further.