Data Generation for AI Fairness
Machine learning models are increasingly used in decision-making processes, largely because of steady improvements in their accuracy. An unintended consequence is that, in the pursuit of high accuracy, these models may not make fair decisions. When humans still outperformed machine learning models at these tasks, improving accuracy was the natural priority. Now that a growing number of such models are deployed to make real-world decisions, the fairness these models exhibit has become a pressing question, especially in sensitive applications such as criminal sentencing and hiring. This raises the problem of designing models that both (1) perform with high accuracy and (2) make fair decisions. Rather than asking the designers of these predictors/classifiers to address both aspects, it is far easier to let them focus on one. This matters all the more because fairness is context dependent: a predictor considered fair in one context may be considered unfair in another.

Our work explores pre-processing as a way to introduce fairness. By modifying the data before training, this approach removes the burden of designing fair classifiers from the designers themselves, and it additionally enables us to test the fairness exhibited by various classifiers. Concretely, our approach is to augment the original data to generate new samples; using these samples, we train a new, fair predictor.
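As a minimal sketch of this kind of pre-processing, the snippet below augments a toy dataset with counterfactual copies of each sample in which the sensitive attribute is flipped but the label is kept, so a downstream classifier cannot rely on that attribute. The specific augmentation rule, feature names, and use of logistic regression are illustrative assumptions, not the exact method of this work.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data: one real-valued feature x, a binary sensitive attribute s,
# and a label y that is (unfairly) correlated with s.
n = 1000
s = rng.integers(0, 2, n)
x = rng.normal(0.0, 1.0, n) + 0.5 * s
y = (x + 0.8 * s + rng.normal(0.0, 0.5, n) > 0.5).astype(int)
X = np.column_stack([x, s])

def augment_counterfactual(X, y):
    """Append a copy of every sample with the sensitive attribute
    (column 1) flipped and the label unchanged. The augmented data is
    symmetric in s, so s carries no signal about y."""
    X_flip = X.copy()
    X_flip[:, 1] = 1 - X_flip[:, 1]
    return np.vstack([X, X_flip]), np.concatenate([y, y])

X_aug, y_aug = augment_counterfactual(X, y)

# Train one classifier on the raw data and one on the augmented data.
clf_raw = LogisticRegression().fit(X, y)
clf_fair = LogisticRegression().fit(X_aug, y_aug)

# The weight the model places on the sensitive attribute shrinks
# toward zero after augmentation.
print("weight on s (raw):      ", clf_raw.coef_[0][1])
print("weight on s (augmented):", clf_fair.coef_[0][1])
```

Because the augmented dataset is exactly symmetric under flipping `s`, the fitted coefficient on the sensitive attribute is driven to (numerically) zero, while the classifier remains free to use the legitimate feature `x`.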