Data Generation for AI Fairness
Abstract
The use of machine learning in decision-making processes has been increasing over the years. The focus of these models has largely been on improving accuracy, which may explain why they are being adopted so widely. An unintended consequence, however, is that in the pursuit of high accuracy these models might not make fair decisions. When humans still outperformed machine learning models in these decision-making processes, improving accuracy was the natural priority. Now that a growing number of such models are deployed to make real-world decisions, the fairness exhibited by these models has come into question, especially in sensitive applications such as criminal sentencing and hiring. This raises the question of how to design models that both (1) achieve high accuracy and (2) make fair decisions.
Instead of asking the designers of these predictors/classifiers to address both aspects, it is far easier to have them focus on one. This matters all the more because fairness is context dependent: a predictor considered fair in one context may be considered unfair in another. We therefore aim to remove the burden of designing fair classifiers from the designers themselves, which is a key advantage of pre-processing for fairness. By modifying the data before training, we relieve the designers of these predictors of the responsibility for introducing fairness. Our work explores this pre-processing approach; in addition to removing the burden from the designers, it also enables us to test the fairness exhibited by a variety of classifiers.
Our approach augments the original data to generate new samples, which we then use to train a new, fair predictor.
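As an illustration only, the sketch below shows one simple form such a pre-processing step could take: the training data is augmented by oversampling under-represented (group, outcome) combinations, and an off-the-shelf classifier is then trained on the augmented data. The function name augment_by_group, the use of oversampling, and the choice of logistic regression are assumptions made for this example, not the specific method developed in this work.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def augment_by_group(X, y, group, seed=0):
    """Oversample each (group, label) combination up to the size of the largest
    one, so the training data no longer under-represents any group/outcome pair.
    This is a generic pre-processing sketch, not the method from this work."""
    rng = np.random.default_rng(seed)
    combos = [(g, c) for g in np.unique(group) for c in np.unique(y)]
    sizes = {gc: int(np.sum((group == gc[0]) & (y == gc[1]))) for gc in combos}
    target = max(sizes.values())
    X_parts, y_parts = [X], [y]
    for (g, c), n in sizes.items():
        if n == 0 or n == target:
            continue  # nothing to duplicate, or already at the target size
        idx = np.flatnonzero((group == g) & (y == c))
        extra = rng.choice(idx, size=target - n, replace=True)
        X_parts.append(X[extra])
        y_parts.append(y[extra])
    return np.vstack(X_parts), np.concatenate(y_parts)

# Hypothetical usage: X, y, and group would come from the dataset of interest.
# X_aug, y_aug = augment_by_group(X, y, group)
# clf = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
```

Because the augmentation happens before training, the downstream classifier and its designer need no knowledge of the fairness intervention, which is the property the pre-processing approach relies on.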