Re: WRONG!
"They are clearly not fit for purpose".
We don't actually know that. We as humans are also guilty of bias, and what one person deems to be fair might be massively unfair to someone else. We are also guilty of using "balance" and "fairness" interchangeably.
For example, let's assume that we have 5 groups of objects. One group is considerably larger than the other 4, and those 4 groups combined don't add up to the number of objects in the large group...so it looks like this:
Group 1: 55 objects
Group 2: 14 objects
Group 3: 9 objects
Group 4: 6 objects
Group 5: 2 objects
In this example, approximately two thirds of the total number of objects are in group 1.
Most people, I think, would define "fair" as every object being treated equally: every object has the same opportunity to be sorted, regardless of the group it is in.
Let's imagine our task here is to filter these objects for quality, in order to create a group made up of the best objects regardless of starting group. The starting groups are based on an objective attribute: the number of sides the objects have. So one group comprises objects with 5 sides, another might be 8 sides, etc...it's an attribute that can be objectively measured. Each group contains every example of an object in that category; until now, they have never been filtered for anything other than their number of sides.
Now to filter for quality, we are looking for those items where all sides are equal length because we have determined that objects with sides that are all equal length are the highest quality. An object with more sides is not automatically better than an object with fewer sides. Number of sides is irrelevant...therefore your starting group is irrelevant.
So now we measure the sides of each object. We know from previous runs that around 10% of objects, regardless of group, typically pass our objective filter that checks for equal sides.
We've sent the objects that don't pass our objective criteria home, and the ones that passed are now in the next room...so our cohort sorted by their original groupings now looks like this:
Group 1: 5 objects
Group 2: 1 object
Group 3: 1 object
Group 4: 1 object
Group 5: 0 objects
Controlling objectively for quality based on measurable attributes, we now have a cohort that is still dominated by Group 1, has just one object each from Groups 2, 3 and 4, and zero objects from Group 5. Groups 2, 3 and 4 are proportionally equal now, whereas prior to filtering they were significantly different. Regardless of the proportions though, every object we now have is objectively high quality.
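To make the arithmetic concrete, here's a minimal Python sketch of the process above. The group sizes come from the example; the side counts, side lengths and the ~10% equal-sides rate are assumptions I've made purely for illustration. The point it demonstrates: the filter never looks at group membership, yet the output shares roughly track the input shares.

```python
import random

random.seed(0)

# Group sizes are from the example above; the side counts and lengths are
# made up purely for illustration (only "5 sides" and "8 sides" appear in the post).
GROUPS = {
    "Group 1 (5 sides)": (5, 55),
    "Group 2 (6 sides)": (6, 14),
    "Group 3 (7 sides)": (7, 9),
    "Group 4 (8 sides)": (8, 6),
    "Group 5 (9 sides)": (9, 2),
}

EQUAL_SIDE_RATE = 0.10  # "previous runs": roughly 10% of objects have all sides equal

def make_object(num_sides):
    """Give ~10% of objects perfectly equal sides; the rest get slightly uneven ones."""
    if random.random() < EQUAL_SIDE_RATE:
        return [10.0] * num_sides
    return [10.0 + random.uniform(-1.0, 1.0) for _ in range(num_sides)]

def passes_quality_filter(sides):
    """The objective check: every side the same length. Group membership is never consulted."""
    return len(set(sides)) == 1

passed = {
    name: sum(passes_quality_filter(make_object(num_sides)) for _ in range(count))
    for name, (num_sides, count) in GROUPS.items()
}

total = sum(passed.values())
for name, n in passed.items():
    share = n / total * 100 if total else 0.0
    print(f"{name}: {n} passed ({share:.0f}% of the final cohort)")
```

Run it a few times with different seeds and the exact counts wobble, but the shape stays the same: the biggest starting group dominates the output, and the smallest group often contributes nothing at all.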
Nothing about my process here was unfair: every object was treated equally, every object had the opportunity to be measured against the same criteria, and none of the attributes that placed them in their original groups affected the outcome. What affected the outcome was the starting number of objects in each group.
The process was fair, but the result is imbalanced...but imbalance is not unfair.
This is where the shit hits the fan...because some people see an imbalance as being unfair, and generally their solution to the imbalance is arbitrary: "well, we need more of Group 5 in the final cohort for it to be 'fair', so let's set a quota for a minimum number of objects from Group 5". A couple of problems arise from this. First, there might be a limited amount of space in the final cohort, so in order to satisfy the quota we have to remove an object that got through from one of the other groups. There is no fair or objective way to do that, and it makes the process of sorting by quality unfair, because now all objects are not treated equally: objects from Group 5 are treated differently to objects from the other groups. The second problem is how you choose an object from Group 5 to fill the quota. You can no longer measure that object against the same objective criteria as the rest, and since allowing the other groups to set the criteria would be deemed "unfair", you have to leave it to Group 5 to nominate its own object...and Group 5 might not be objective at all as a group. Either way, we now have an object in the final cohort that arrived there without going through the same criteria as every other object from the rest of the groups.
The other option is to lower the criteria for quality...but now it's no longer necessary to be the best, so you reduce the overall quality of your final cohort. And if there are limited spaces in that cohort, you potentially shift the balance again, because you could end up with twice as many objects passing as you have space for...so now we need a second process to filter the cohort down. But we also still need to be "fair" in the eyes of the smaller groups...so we impose some arbitrary limits on the number of objects that can come from a given group, and at the same time we impose quotas to ensure a minimum number from each group, to maintain a balance and appease those who deem an imbalance to be unfair...so now we're throwing high quality candidates in the bin, lowering the average quality of the cohort, and imposing an objectively unfair process.
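Here's a rough sketch of that trade-off with made-up quality scores, seat counts and a one-seat quota (none of these numbers come from anywhere real). It compares a purely merit-based pick against the same pick with a reserved seat for Group 5: the quota displaces an object that passed the objective filter and drags the average quality of the cohort down.

```python
# Hypothetical candidates: (group, quality score 0-100, higher is better).
candidates = [
    ("Group 1", 98), ("Group 1", 95), ("Group 1", 93), ("Group 1", 90), ("Group 1", 88),
    ("Group 2", 92),
    ("Group 3", 91),
    ("Group 4", 89),
    ("Group 5", 60),   # did not meet the original objective criteria
]

SEATS = 8  # limited space in the final cohort

def merit_only(pool, seats):
    """Take the top `seats` by quality; group membership is ignored."""
    return sorted(pool, key=lambda c: c[1], reverse=True)[:seats]

def with_quota(pool, seats, quota_group="Group 5", quota=1):
    """Reserve `quota` seats for `quota_group`, then fill the rest on merit."""
    reserved = [c for c in pool if c[0] == quota_group][:quota]
    rest = sorted((c for c in pool if c[0] != quota_group),
                  key=lambda c: c[1], reverse=True)[:seats - len(reserved)]
    return reserved + rest

def avg_quality(cohort):
    return sum(score for _, score in cohort) / len(cohort)

print("merit only:", round(avg_quality(merit_only(candidates, SEATS)), 1))   # 92.0
print("with quota:", round(avg_quality(with_quota(candidates, SEATS)), 1))   # 88.5
```

The displaced object here is the 88-scoring one from Group 1, which met the criteria, while the 60-scoring object takes its seat without ever passing the same filter.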
So how does this tie into AI?
Well, if you create a dataset that is as objective as possible (as in, you collect and use data that has been filtered for quality using objective measurements), you might end up with a model that is imbalanced (or biased) but isn't unfair, and is of relatively high quality...it might disproportionately represent, or offend, a small number of people, but it will be objectively fair. If you fuck with that data to try and address that imbalance or bias, then you are creating a dataset that is fundamentally unfair, because it won't fairly represent anyone to any objective standard, and it will be of much lower quality, which limits its usefulness.
You can't subjectively make things fair. Fairness comes from objectivity...and the same thing applies to correcting a bias. Changing the criteria of what makes something salty, in order to shift some things from the salty category into the sweet category, doesn't change the fact that the salty objects are objectively salty...you haven't fixed the bias here, you've just created shit data that nobody can objectively rely upon.
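And a last sketch of the salty/sweet point, with made-up salt measurements and arbitrary thresholds. Shifting the threshold to "rebalance" the two categories changes the labels, but it doesn't change the measurements, so the new labels no longer agree with what anyone measuring the objects would find.

```python
# Hypothetical salt content (grams per 100g); the thresholds are arbitrary,
# chosen only to illustrate the point.
items = {"crisps": 1.8, "salted nuts": 1.5, "pretzels": 1.2, "biscuit": 0.6, "cake": 0.3}

def label(salt_per_100g, threshold):
    return "salty" if salt_per_100g >= threshold else "sweet"

original = {name: label(v, threshold=1.0) for name, v in items.items()}
# "Rebalance" the categories by moving the threshold so fewer things count as salty.
rebalanced = {name: label(v, threshold=1.6) for name, v in items.items()}

for name, v in items.items():
    note = f"  <- relabelled, but still measures {v:.1f}g" if original[name] != rebalanced[name] else ""
    print(f"{name}: {original[name]} -> {rebalanced[name]}{note}")
```

The relabelled items are still objectively salty by measurement; all the threshold shift produced is a dataset whose labels can't be trusted.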