This article is about Technology

Racial bias in machine learning algorithms

Jessica Ferreira Soares

Published on October 15, 2021

Is there racial bias in machine learning algorithms after all?

Racial Bias: An Autonomous Systems’ Problem
Intelligent systems are becoming increasingly popular because of the ease and speed with which they can perform a large number of tasks, often better than human beings. The algorithms used to build these systems enable autonomous decision making through machine learning (ML) techniques [3].

Machine learning algorithms have become popular because they can solve complex problems without being explicitly programmed to do so, learning instead by recognizing patterns in data [2]. However, these techniques can be influenced by bias that has the potential to harm populations that already suffer from historical disadvantage [3].

Unfortunately, the bias generated by these models resembles human bias regarding race, sex, religion and other grounds of discrimination [2]. For example, in a phenomenon known as the other-race effect (ORE), people recognize faces of their own race more easily; the same occurs with certain algorithms, which recognize faces of a particular ethnicity more accurately [1].

Bias in algorithms is of particular concern in autonomous or semi-autonomous systems that have no human in the loop to detect and compensate for existing bias [4]. Thus, in addition to possibly treating a specific group of individuals unfairly, ML can contribute to the online invisibility of these groups by prioritizing some results over others [3].

The problem of algorithmic bias has gained increasing media attention following several high-profile incidents [4]. For example, a Google image search labeled two black Americans as gorillas [3]; the FaceApp application tended to lighten the skin of black people to match the standard of beauty its algorithm used [3]; systems for predicting criminal recidivism were biased against a particular racial group [4]; and Microsoft's chatbots Tay and Zo both exhibited anti-Semitic, racist and sexist behavior [2].

The situations described above may be caused by the input data used to train the ML model [2]. In addition, the algorithm can adapt over time to the implicit and explicit social biases to which it is exposed, which in turn produces stereotyped and unfair profiles of people [3].

There is still no solution capable of resolving every case of algorithmic bias. However, it is important to remember that when building autonomous systems, we want the algorithms to be better versions of human beings, so racial bias and other harmful biases should be reduced as much as possible.

Understanding the Origin of Algorithmic Bias

Machine learning algorithms should recognize the faces of all races equally well if the right image characteristics, or features, were used to analyze the different racial groups. Racial bias in an algorithm can therefore be identified from the difference in accuracy between ethnic groups [1].
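The accuracy comparison described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names `accuracy_by_group` and `accuracy_gap`, the group labels "A" and "B", and the toy data are all hypothetical and do not come from the cited studies.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy}, splitting the predictions by demographic group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


def accuracy_gap(per_group):
    """Largest difference in accuracy between any two groups."""
    return max(per_group.values()) - min(per_group.values())


# Hypothetical toy data: 1 = "same person", 0 = "different person".
y_true = [1, 1, 0, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 1, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group = accuracy_by_group(y_true, y_pred, groups)
print(per_group)                # {'A': 1.0, 'B': 0.5}
print(accuracy_gap(per_group))  # 0.5
```

A gap near zero does not prove the absence of bias, but a large gap like the one in this toy example is a signal that the model deserves closer inspection.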

Racial bias in face recognition tasks has been reported since 1990. In 2014, the accuracy of facial recognition improved significantly thanks to algorithms based on deep convolutional neural networks (DCNNs) [1]. Studies of pre-DCNN algorithms reveal that race affects face identification in ways that are difficult to predict.

Although few studies have assessed racial bias in DCNNs so far, some trends have already been detected in certain neural network architectures. For example, the VGG-Face algorithm has been reported to be less accurate for black women and young women. Furthermore, the older COTS (commercial off-the-shelf) and VGG-Face algorithms perform better on white faces, while the two newer algorithms, COTS and ResNet, perform better on black faces [1].

Given the complexity of the algorithmic bias problem, it should be assessed not only in the ML models themselves but also in other possible causes of bias. Danks and London (2017) [4] classify biases according to their source, which can be: training data, algorithmic focus, algorithmic processing, context transfer and interpretation.

Bias due to input data: The input data used to train the machine learning model can cause bias, because the images used can vary in demographics, quality, lighting and point of view. In addition, neural networks require a large volume of training data, and the images used must cover many different identities as well as demographic diversity [1].

A simple example of data bias is an autonomous car that is trained mostly on data from a single US city but is meant to be used anywhere in the country. As a result, the algorithm learns only regional traffic rules [4].

Bias due to algorithmic focus: Occurs when some data cannot be used, for example because of legal restrictions, so that the model only has access to a specific dataset [4].

Bias due to algorithmic processing: Occurs when the algorithm itself is biased in some way. This kind of bias can be useful when you want to compensate for biases present in noisy or anomalous data [4].

Bias due to context transfer: This type of bias is present when an application is used outside the context for which it was built. Returning to the autonomous car example, it would be a problem to deploy a system built to operate in the US in the UK, where people drive on the left side of the road [4].

Bias due to interpretation: Occurs when the modeling assumptions of the machine learning method do not match the requirements of the application.

How to mitigate bias in machine learning algorithms?

First of all, you must be careful when claiming that an algorithm is biased: verify the source of the bias and specify the standard or norm against which it is being judged [4]. There is still no bullet-proof method for solving the bias problems that exist today, so more research on computational methods to mitigate them is needed [2].

Despite this, researchers suggest some actions to mitigate bias. For example, the training data should be chosen to represent the entire population being evaluated [2].
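One simple way to check whether training data represents the target population is to compare each group's share of the sample with its share of the population. The sketch below is only an illustration: the function `representation_gaps`, the group labels and the population shares are hypothetical assumptions, not figures from the cited sources.

```python
from collections import Counter


def representation_gaps(sample_groups, population_shares):
    """Share of each group in the sample minus its share in the population.

    A positive value means the group is over-represented in the sample,
    a negative value means it is under-represented.
    """
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return {
        group: counts.get(group, 0) / n - share
        for group, share in population_shares.items()
    }


# Hypothetical example: the sample is 80% group A, the population is 50/50.
sample = ["A"] * 8 + ["B"] * 2
population = {"A": 0.5, "B": 0.5}

gaps = representation_gaps(sample, population)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

In this toy case group A is over-represented by about 30 percentage points and group B is under-represented by the same amount, which suggests collecting more data for group B before training.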

It is also possible either to develop a new algorithm that is unbiased for the application in question or to use an algorithm with a deliberate bias that eliminates or compensates for existing biases in the data [4].
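As one illustration of deliberately compensating for imbalanced data, the sketch below assigns each training sample a weight inversely proportional to the size of its group, so that every group contributes equally to training overall. Reweighting is a technique swapped in here for illustration, not a method prescribed by the cited sources, and the function name `balancing_weights` and the data are hypothetical.

```python
from collections import Counter


def balancing_weights(groups):
    """Per-sample weights that give every group the same total weight.

    weight = n_samples / (n_groups * group_size)
    """
    counts = Counter(groups)
    n_samples = len(groups)
    n_groups = len(counts)
    return [n_samples / (n_groups * counts[g]) for g in groups]


# Hypothetical imbalanced sample: 8 examples from group A, 2 from group B.
groups = ["A"] * 8 + ["B"] * 2
weights = balancing_weights(groups)

total_a = sum(w for w, g in zip(weights, groups) if g == "A")
total_b = sum(w for w, g in zip(weights, groups) if g == "B")
print(weights[0], weights[-1])  # 0.625 2.5
print(total_a, total_b)         # 5.0 5.0
```

Many training libraries accept per-sample weights of this kind. Whether reweighting actually reduces bias depends on the application, so the per-group results should be measured again afterwards.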

Public policy plays an essential role in addressing and legislating against model bias. In the United States there are laws that mitigate bias, such as the Fair Housing Act, which prohibits discrimination in selling, financing or renting property. Public Law 88-352 (the Civil Rights Act of 1964) prohibits discrimination based on sex and race in hiring, promotion and firing [3]. Although these laws mitigate explicit bias, they still cannot eliminate the implicit and unconscious biases present in algorithms [3].

In this context, legislators, technology professionals and companies need to work together to build principles and values that eliminate bias in autonomous systems.

Furthermore, there is a relationship between the diversity of work environments and algorithmic bias, especially considering that this discrimination may be intentional. For this reason, increasing diversity in technology teams is essential to building fairer ML algorithms [3]. High-tech companies that neither encourage nor welcome diversity in the workplace also alienate the professionals they do hire.

For example, black Americans hold fewer than 2% of senior and executive positions in American high-tech companies, compared with 3% for Latinos, 11% for Asians and 83% for whites. Even when people who are not white get jobs in these companies, they feel socially isolated, which affects their participation and makes them more likely to leave [3].

Considering these factors, companies should not only hire diverse teams but also make an effort to ensure that new employees feel welcome to express themselves and contribute in the work environment.


After analyzing the impact of biases on algorithms and their various sources, it is possible to conclude that machine learning models that cause some type of discrimination must be analyzed, fixed and, if this is not possible, discarded [3]. The sources of bias presented in this article are not mutually exclusive, and other taxonomies of bias sources may exist. It is important to remember that algorithmic bias can be beneficial when it helps mitigate the overall bias of the system [4].


[1] CAVAZOS, Jacqueline G. et al. Accuracy comparison across face recognition algorithms: Where are we on measuring race bias? IEEE Transactions on Biometrics, Behavior, and Identity Science, 2020.

[2] FUCHS, Daniel James. The dangers of human-like bias in machine-learning algorithms. Missouri S&T’s Peer to Peer, v. 2, n. 1, p. 1, 2018.

[3] LEE, Nicol Turner. Detecting racial bias in algorithms and machine learning. Journal of Information, Communication and Ethics in Society, 2018.

[4] DANKS, David; LONDON, Alex John. Algorithmic Bias in Autonomous Systems. In: IJCAI. 2017. p. 4691-4697.

