Meaning of Bias?
The supposed bias in Google Translate is the second time I have read stories about gender 'bias' in Google AI projects, and in both cases there was no bias, just an unwillingness to accept what the statistical data was telling the developers; the 'solution' itself introduced bias.
The problem for Google Translate is that it does not work like a human translator, understanding the original text and its context and then producing text with the equivalent meaning in another language. Instead, through machine learning, it looks at combinations of words and how they are most commonly associated with words in another language. There is no understanding, and therefore the translation can be completely wrong. The further apart the languages are in structure and grammar, the more likely errors are, including gender errors.

In English, if the subject is singular and the sex is not known or not relevant, the pronoun is most commonly 'he'; in modern English, in order to preserve neutrality about sex, it might be 'they'. The latest Translate offers both 'he' and 'she' as options, but 'she' is never right if the original pronoun is intended as neutral, so the translation has become worse and will only be right occasionally, by chance. A bias not present in the original machine learning algorithm or data has been deliberately introduced.

We could just be grown up and recognise that machine translations have many flaws, and the issue of gender in translation is by no means the worst. We may use Google Translate for its speed or lack of cost, but we have to accept that it is not a human translator.
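To make the mechanism concrete, here is a minimal sketch of how a purely statistical translator handles a gender-neutral source pronoun. The counts are invented for illustration and this is not Google's actual model or data: the point is only that frequency decides, and the writer's intent plays no part.

```python
from collections import Counter

# Hypothetical corpus counts for English pronouns aligned with a
# gender-neutral source pronoun (invented numbers, illustration only).
pronoun_counts = Counter({"he": 700, "she": 250, "they": 50})

def translate_neutral_pronoun(counts: Counter) -> str:
    """Return the most frequent target pronoun seen in training.

    Context and intended meaning play no part in the choice: the
    statistics alone decide, so the output is 'he' whenever 'he'
    dominated the training corpus.
    """
    return counts.most_common(1)[0][0]

print(translate_neutral_pronoun(pronoun_counts))  # prints "he"
```

Offering 'she' as an alternative does not repair this: for a genuinely neutral source pronoun, neither statistical choice recovers the intended meaning.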
The other case of 'bias' was when Google trained an AI to help filter CVs from job applicants and discovered that it was preferentially selecting male candidates, which again was described as bias. What had actually happened was that the training data showed that the best-performing employees were more likely to be men than women, and the model naturally incorporated this correlation. We can speculate about why this is, but it is well known that, on average, women work substantially fewer hours and take longer career breaks. A human interviewer cannot ask a female candidate in her late twenties whether she plans to start a family and take time off, but a machine learning algorithm cannot help incorporating the probability of this into its behaviour. In both cases the problem is not bias but the fact that, in certain areas, it is politically convenient for people to ignore reality. When a machine learning algorithm makes that hard to do, the result is described as bias.
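The CV-screening case can be sketched the same way. This is a deliberately tiny stand-in for a real model, with all numbers invented: any model fitted to historical outcomes ends up encoding the conditional rates in its training data, including rates tied to attributes an interviewer could never ask about.

```python
# Invented training data: each row is (feature, label), where the feature
# marks whether the employee took a long career break and the label marks
# whether they were rated a top performer. Numbers are illustrative only.
training = [
    (1, 0), (1, 0), (1, 1),
    (0, 1), (0, 1), (0, 1), (0, 0), (0, 1),
]

def top_performer_rate(data, feature_value):
    """Conditional rate of the label given the feature.

    A fitted model's 'learned rule' for a feature is, in essence, this
    conditional rate: it cannot avoid reflecting whatever correlation
    exists in its training data.
    """
    labels = [label for f, label in data if f == feature_value]
    return sum(labels) / len(labels)

print(top_performer_rate(training, 0))  # prints 0.8
print(top_performer_rate(training, 1))  # roughly 0.33
```

Forbidding the feature at interview does nothing here: the correlation is already baked into the historical outcomes the model was trained on.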