Pneumothorax Segmentation of Chest X-Rays Using a Stacked Generalization Framework with Multiple Convolutional Neural Networks
Abstract
With advances in deep learning research, convolutional neural networks (CNNs) have achieved state-of-the-art results in common computer vision tasks such as image classification and image segmentation. In the past decade, most research on CNNs has focused on optimizing their mathematical structure, or architecture, to improve performance on these tasks. Nearly a decade after deep CNN architectures first rose to prominence, state-of-the-art CNNs have surpassed human performance on image classification and segmentation tasks. Given this capability to achieve super-human performance on image-recognition tasks, recent research has trained CNNs to perform difficult medical imaging tasks such as cancer segmentation. While researchers studying image segmentation have focused on optimizing specific CNN architectures to perform well on complex tasks such as medical image segmentation, there is significant unexplored potential in applying ensemble machine learning techniques to combine the predictions of multiple CNN architectures and produce more robust models. Ensemble machine learning is an area of machine learning that combines the predictions of several models, using techniques such as majority voting, averaging, and stacked generalization, in order to produce models with lower generalization error. Stacked generalization is a powerful technique that involves training a higher-level model that takes the predictions of the lower-level models, together with each input instance, and generates the final predictions. This research proposes and evaluates a framework for applying stacked generalization to combine the predictions of multiple CNNs and improve performance on medical image segmentation tasks.
The proposed method allows researchers to combine different state-of-the-art CNN architectures into a larger neural network that uses the predictions of each individual CNN and the properties of the input image to generate a more accurate set of predictions. We evaluate the effectiveness of this method by comparing the performance of individual CNNs and the proposed method on a medical image segmentation dataset from the 2019 SIIM-ACR Kaggle Pneumothorax Segmentation Challenge.
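To make the ensemble techniques named in the abstract concrete, the following is a minimal sketch, not the framework evaluated in this work. It assumes each base CNN has already produced a per-pixel probability map, and it substitutes a per-pixel logistic regression for the larger neural network used as the higher-level model. All function names, shapes, and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def average_ensemble(prob_maps):
    """Averaging: mean per-pixel probability over models. prob_maps: (n_models, H, W)."""
    return prob_maps.mean(axis=0)


def majority_vote(prob_maps, threshold=0.5):
    """Majority voting: a pixel is positive if more than half of the models say so."""
    votes = (prob_maps >= threshold).sum(axis=0)
    return (2 * votes > prob_maps.shape[0]).astype(np.uint8)


def stack_features(prob_maps, image):
    """Per-pixel features: each base prediction, the input pixel value, and a bias term."""
    n_models, h, w = prob_maps.shape
    columns = [p.reshape(-1) for p in prob_maps]
    columns.append(image.reshape(-1))
    columns.append(np.ones(h * w))
    return np.stack(columns, axis=1)  # shape (H*W, n_models + 2)


def fit_meta_learner(features, targets, lr=0.5, steps=500):
    """Assumed stand-in meta-learner: logistic regression trained by gradient descent."""
    weights = np.zeros(features.shape[1])
    for _ in range(steps):
        preds = sigmoid(features @ weights)
        grad = features.T @ (preds - targets) / len(targets)
        weights -= lr * grad
    return weights


def predict_stacked(prob_maps, image, weights):
    """Stacked generalization: the meta-learner maps base predictions + input to a final map."""
    feats = stack_features(prob_maps, image)
    return sigmoid(feats @ weights).reshape(image.shape)


# Synthetic example: a square "lesion" mask and three noisy base-model predictions.
rng = np.random.default_rng(0)
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0
image = rng.random((8, 8))
base_preds = np.clip(mask + rng.normal(0.0, 0.3, size=(3, 8, 8)), 0.0, 1.0)

# For brevity the meta-learner is fit and applied on the same image; in practice it
# should be trained on base-model predictions for held-out data to avoid leakage.
weights = fit_meta_learner(stack_features(base_preds, image), mask.reshape(-1))
stacked = predict_stacked(base_preds, image, weights)
averaged = average_ensemble(base_preds)
voted = majority_vote(base_preds)
```

Unlike averaging or voting, which weight every model equally, the stacked model learns how much to trust each base prediction and can condition on the input itself, which is the property the proposed framework relies on.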