Effective Learning with Heterogeneous, Noisy, Multi-Relational Healthcare Data








As machine learning models have advanced, so have their applications to real-world problems. Healthcare is one of the most critical areas for machine learning research because it directly affects the lives of the general population. Although a great deal of work has been done, and continues to be done, on machine learning for healthcare, a variety of open problems remain in handling the underlying noisy structure of the data and its multi-modality. Classical machine learning requires data in the form of a flat feature vector, but with real-world data, and especially healthcare data, this is seldom the case. The data is almost always multi-modal, i.e., it consists of different data types such as relational (graph) data, images, and text, to name a few. Machine learning models developed for healthcare are therefore expected to take advantage of the varying types of data provided to them and learn more effectively. Naive solutions rely on a large amount of available data to learn robust models. The weakness of such models is generalization: in the healthcare domain a model can be faced with unseen tasks, such as identifying a new strain of a virus or a new drug, which can cause it to fail. Drawing on human experts thus becomes an important part of developing models for such high-impact tasks. Humans and machines can work in unison to handle the problem of generalization and can learn from one another. This dissertation explores these challenges in depth and tries to address them individually before presenting a unified view of how to overcome them within a machine learning framework. We analyse and present detailed solutions to each challenge and also show how the challenges are interconnected.
We examine the importance of learning machine learning models from different modalities of data and gain insight into why moving from a propositional setting to a more general relational setting matters. We also demonstrate the importance of human experts and how they can be crucial in high-impact domains such as healthcare. One chapter shows that even when the machine learning models are adversarial among themselves, using a human expert as an ally in the adversarial setting can help in learning far more effective models. This dissertation thus proposes several methods to overcome the most glaring problems in developing effective machine learning models for healthcare tasks. It also outlines several challenges that lie ahead and must be overcome to realize the full potential of the changes machine learning can bring to healthcare.



Machine learning, Medical records -- Data processing, Medical care -- Data processing, Artificial intelligence -- Medical applications



©2020 Devendra Singh Dhami. All rights reserved.