Efficient Fair Learning With Subset Selection


December 2023





Fairness in machine learning has become crucial, since many applications rely on data that has a significant impact on people’s lives. In addition, machine learning models are data-hungry: training state-of-the-art models on large datasets requires significant computational resources and time. Furthermore, most existing fair learning techniques require substantial changes to model training, data pre-processing, or post-processing, which makes them difficult to adopt in real-time applications. This paper addresses these issues by introducing a fair and efficient machine learning strategy based on subset selection. The approach follows the GLISTER strategy, a mixed continuous and discrete bilevel optimization approach to data subset selection: the inner optimizer remains a standard training algorithm, while the outer optimizer iteratively selects a subset of the training data. The subset is chosen to improve fairness without a major loss of accuracy. The paper focuses on three fairness metrics: demographic parity difference, equalized odds difference, and equal opportunity difference. Experiments on several real-world datasets from different domains show comparable or better performance relative to other fair learning and data subset selection techniques.
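
As an illustration of the three fairness metrics named above, the following is a minimal sketch in plain Python. The abstract does not give the paper's exact definitions, so the sketch assumes the commonly used convention: each metric is the gap between the largest and smallest group-wise rate, and the equalized odds difference is the larger of the true-positive-rate gap and the false-positive-rate gap. The function names (`group_rates`, `fairness_gaps`) and the toy data are hypothetical and are not taken from the paper.

```python
def _rate(flags):
    """Mean of a list of 0/1 values; 0.0 if the list is empty."""
    return sum(flags) / len(flags) if flags else 0.0


def group_rates(y_true, y_pred, group, g):
    """Selection rate, true positive rate and false positive rate for group g."""
    idx = [i for i, a in enumerate(group) if a == g]
    selection = _rate([y_pred[i] for i in idx])
    tpr = _rate([y_pred[i] for i in idx if y_true[i] == 1])
    fpr = _rate([y_pred[i] for i in idx if y_true[i] == 0])
    return selection, tpr, fpr


def fairness_gaps(y_true, y_pred, group):
    """Assumed definitions: max-minus-min gap of each rate across groups."""
    rates = [group_rates(y_true, y_pred, group, g) for g in sorted(set(group))]
    sel, tpr, fpr = zip(*rates)
    tpr_gap = max(tpr) - min(tpr)
    fpr_gap = max(fpr) - min(fpr)
    return {
        "demographic_parity_difference": max(sel) - min(sel),
        "equal_opportunity_difference": tpr_gap,
        "equalized_odds_difference": max(tpr_gap, fpr_gap),
    }


# Toy binary labels, predictions and a two-valued sensitive attribute.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 1, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]

for name, value in fairness_gaps(y_true, y_pred, group).items():
    print(f"{name}: {value:.2f}")
```

A value of 0 for a metric means the corresponding rate is identical across groups; larger values indicate a larger disparity.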



Fair learning, Machine learning, Computer science, Efficient fair learning, Subset selection