Efficient Fair Learning With Subset Selection
Abstract
Fairness in machine learning has become crucial, since many application scenarios involve data with a significant impact on people's lives. In addition, machine learning models are data-hungry: training state-of-the-art models on large datasets requires significant computational resources and time. Furthermore, most existing fairness techniques require substantial changes to model processing, data pre-processing, or post-processing, which makes them difficult to adopt in real-time applications. We address these issues by introducing a fair and efficient machine learning strategy based on subset selection. The approach follows the GLISTER strategy, a mixed discrete-continuous bilevel optimization approach to selecting a subset of the training data: the inner optimizer remains a standard training algorithm, while the outer optimizer iteratively selects the subset. This iterative selection aims to improve fairness without a major sacrifice in accuracy. The paper focuses on three fairness metrics: demographic parity difference, equalized odds difference, and equal opportunity difference. Experiments on several real-world datasets from different domains show comparable or better performance relative to other fair learning and data subset selection techniques.
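The three fairness metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not define them, so the definitions below are assumptions that follow a common convention for binary labels and predictions: demographic parity difference is the gap in selection rates across groups, equal opportunity difference is the gap in true positive rates, and equalized odds difference is the larger of the true positive rate gap and the false positive rate gap. The function name `fairness_gaps` and the helper `_rate` are hypothetical and are not taken from the paper.

```python
import numpy as np


def _rate(pred, mask):
    # Fraction of positive predictions among the rows selected by mask
    # (returns 0.0 for an empty selection; this fallback is an assumption).
    return float(pred[mask].mean()) if mask.any() else 0.0


def fairness_gaps(y_true, y_pred, group):
    """Return (DPD, EOD, EOpD) for binary labels, binary predictions,
    and a sensitive-attribute array. Definitions are assumed, see above."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    sel, tpr, fpr = [], [], []
    for g in np.unique(group):
        in_g = group == g
        sel.append(_rate(y_pred, in_g))                   # selection rate
        tpr.append(_rate(y_pred, in_g & (y_true == 1)))   # true positive rate
        fpr.append(_rate(y_pred, in_g & (y_true == 0)))   # false positive rate
    dpd = max(sel) - min(sel)             # demographic parity difference
    eopd = max(tpr) - min(tpr)            # equal opportunity difference
    eod = max(eopd, max(fpr) - min(fpr))  # equalized odds difference
    return dpd, eod, eopd


# Toy data: two groups of four examples each.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
print(fairness_gaps(y_true, y_pred, group))  # (0.25, 0.5, 0.0)
```

In the toy data both groups have the same true positive rate, so the equal opportunity difference is zero, while the false positive rates differ and drive the equalized odds difference. A value of zero for any of the three metrics means the groups are treated identically under that criterion.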