This essay was originally submitted as a final paper for Ethics and Public Policy in Computing (17–200) in the Fall 2020 semester at Carnegie Mellon University.
Currently, our society is plagued with many examples of racially biased Machine Learning and Artificial Intelligence algorithms that are used all across the country and that affect minority groups, particularly African Americans, in areas such as health care, predictive policing, facial recognition, and hireability. This paper investigates how our society is currently handling this serious issue. I will explore how Machine Learning models can become biased by analyzing what kind of data is used to train them; how a lack of auditing, and a lack of transparency about auditing, allows biased algorithms to be released to the public; and how money, efficiency, and a widespread belief that algorithms are less biased than people allow these algorithms to remain in use. I will then discuss how stricter regulations, education, and a bigger push for diversity in the tech industry can reduce the probability of a biased algorithm being released and used in public, and what these changes would imply.
As our technology advances, it becomes more powerful, thus having a higher potential to impact any one person’s life. With the rise of Machine Learning and Artificial Intelligence we are beginning to see a lot of really important decisions become automated in the name of efficiency. Many people believe that these models are objective since they just give us predictions based on data, so how can these models be biased?
Bias can stem from many areas. First, models are trained on data and proxies, which may themselves be biased. Biased inputs result in biased algorithms. This will be explored further in our discussion of models used for health care and predictive policing. Second, developers commonly build their models on data sets in which minorities are poorly represented. For example, a commonly used dataset “features content with 74% male faces and 83% white faces.” This leads to poor facial recognition technology for African Americans, as we saw in 2015 when Google’s facial recognition algorithm “tagged two African Americans users as gorillas.” Finally, AI algorithms pick up on our own biases as a society. A famous example of this is Microsoft’s Tay, an “AI chatbot that spread disturbingly racist messages after being taught by users within a matter of hours.” The bottom line is that Machine Learning and Artificial Intelligence algorithms can be biased, and we are currently using these biased algorithms to the detriment of minority groups.
Machine Learning and Artificial Intelligence algorithms can be biased. Until our society can hold these algorithms to higher standards through the appropriate legislation and regulations, we should not be using them, especially in areas that seriously affect people’s lives, like health care, policing, and hiring, where we are allowing injustice to occur.
Background Review
Health Care
First, let us discuss what happens when we use proxies that may have a racial bias in the healthcare system. In 2019, it was discovered that “an algorithm widely used in US hospitals to allocate health care to patients” and affecting 200 million people in the U.S. each year “[had] been systematically discriminating against black people.” The algorithm assigns a risk score so that those with a higher risk score are marked as needing more specialized care. Researchers found that “At a given risk score, Black patients are considerably sicker than White patients, as evidenced by signs of uncontrolled illnesses. Remedying this disparity would increase the percentage of Black patients receiving additional help from 17.7 to 46.5%.” This is a serious problem: because of this algorithm, many Black patients were not marked as being as sick as they were, and their care was delayed or denied as a result.
Why was this algorithm biased? It was trained on the proxy of health care costs, on the assumption that someone who has spent more money on health care is probably sicker. Although the “average black person in the data set … had similar overall health-care costs to the average white person… the average black person was also substantially sicker than the average white person.” Thus, because less is spent on health care for African Americans at a given level of illness, the algorithm treated a Black patient as no sicker than a white patient with the same costs, even when the Black patient was substantially sicker.
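The proxy problem described above can be illustrated with a toy sketch. Everything in it is hypothetical: the patient records, the group labels, the dollar amounts, and the `risk_score_from_cost` function are made up for illustration and are not the actual hospital algorithm. The sketch only shows that a score built solely on past spending gives two patients with equal costs the same score, even when one of them has more chronic conditions.

```python
# Toy illustration of a cost-based proxy (all data and names are hypothetical).

def risk_score_from_cost(patient, max_cost):
    """Score a patient purely on past spending, ignoring actual health need."""
    return patient["annual_cost"] / max_cost


def need_by_group_at_cost(patients, cost):
    """Return the number of chronic conditions per group among patients with this cost."""
    return {
        p["group"]: p["chronic_conditions"]
        for p in patients
        if p["annual_cost"] == cost
    }


patients = [
    {"group": "A", "chronic_conditions": 2, "annual_cost": 8000.0},
    {"group": "B", "chronic_conditions": 4, "annual_cost": 8000.0},
    {"group": "A", "chronic_conditions": 5, "annual_cost": 16000.0},
]

max_cost = max(p["annual_cost"] for p in patients)
score_a = risk_score_from_cost(patients[0], max_cost)
score_b = risk_score_from_cost(patients[1], max_cost)

# Same spending produces the same score, although the group B patient is sicker.
print(score_a, score_b)
print(need_by_group_at_cost(patients, 8000.0))
```

In this made-up data, the cost-based score cannot distinguish the two patients who each spent 8,000, which is the kind of gap the quoted study describes.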
Next, let us consider how the African American community is being affected by predictive policing algorithms. There are two kinds of predictive policing tools: location-based algorithms like PredPol, which acts like a weather forecast for crime in an area, and person-based algorithms like COMPAS, which gives someone a score between 1 and 10 indicating how likely they are to be rearrested if released. Both algorithms are used in dozens of U.S. cities and jurisdictions. An article published by ProPublica found that “black defendants were twice as likely to be incorrectly labeled as higher risk than white defendants. Conversely, white defendants labeled low risk were far more likely to end up being charged with new offenses than blacks with comparably low COMPAS risk scores.”
Why are these algorithms biased? They are fed data about arrest rates, and “according to US Department of Justice figures, you are more than twice as likely to be arrested if you are Black than if you are White. A Black person is five times as likely to be stopped without just cause as a White person.” Thus we see that using arrest rates to train a model will result in a biased algorithm, as it is trained on our own societal biases. Also, researchers at Georgetown Law School found that, “an estimated 117 million American adults are in facial recognition networks used by law enforcement, and that African-Americans were more likely to be singled out primarily because of their over-representation in mug-shot databases.” Although these algorithms are mandated by law not to consider race as a factor, overrepresentation due to our historical biases, combined with proxies like zip code, socioeconomic background, and education, can still be used to find African Americans and identify them as higher risk, or to flag predominantly Black communities as needing more policing. This amplifies our current biases, as these algorithms only lead us to target African Americans further.
Why does this matter? The UK government’s Centre for Data Ethics and Innovation found “identifying certain areas as hot spots primes officers to expect trouble when on patrol, making them more likely to stop or arrest people there because of prejudice rather than need.” This creates a feedback cycle that amplifies the current bias in arrest rates.
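The feedback cycle can be sketched with a deliberately simplified simulation. The assumptions here are mine and are not taken from any real policing tool: two hypothetical areas have the same underlying rate at which patrols lead to arrests, all patrols in each round are sent to whichever area has more recorded arrests (a winner-take-all rule), and new arrests are recorded only where patrols go.

```python
# Toy feedback-loop simulation (hypothetical areas, rates, and allocation rule).

def simulate_hot_spot_feedback(recorded, true_rate, rounds, patrols_per_round=10):
    """Send all patrols to the area with the most recorded arrests each round."""
    recorded = dict(recorded)  # copy so the caller's data is not modified
    for _ in range(rounds):
        hot_spot = max(recorded, key=recorded.get)
        # Arrests are only observed where officers are sent.
        recorded[hot_spot] += patrols_per_round * true_rate[hot_spot]
    return recorded


initial_records = {"north": 12, "south": 10}   # small historical difference
true_rate = {"north": 0.5, "south": 0.5}       # identical underlying rates

result = simulate_hot_spot_feedback(initial_records, true_rate, rounds=5)
print(result)
```

Under these assumptions the area that starts with slightly more recorded arrests keeps attracting every patrol, so its record grows each round while the other area's record never changes, even though the underlying rates are identical.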
Furthermore, this affects the hireability of African Americans. “The labor market is already shaped by a technology that seeks to sort out those who are convicted, or even arrested, regardless of race … When such technological fixes are used by employers to make hiring decisions in the name of efficiency, there is little opportunity for a former felon … to garner the empathy of an employer.” The injustice caused by this one predictive policing algorithm extends not only to hindering the freedom of African Americans and amplifying racial biases in policing, but it also directly results in less diversity in our workplace.
How are politicians responding to this bias? Recently, Congress has “established a 19-person federal advisory committee to track the growth and recommend best practices” in its “Future of AI Act”. We can see that the federal government is starting to regulate the industry more. Recently New York has “instituted legislation to establish its own task force to … facilitate the testing of algorithms, determine how citizens can request input on algorithmic decisions, … and investigate whether the source code for city agencies should be publicly available. The city will also explore if certain algorithms are biased against certain residents and how they hold companies accountable.” The only issue is that company-controlled algorithms are not usually open and available to city governments, which leaves those governments with minimal power. In Allegheny County, PA, “the Department of Human Services has built its own algorithm to better manage children protective services … Working with researchers, county human service officials created the Allegheny Family Screening Tool, which improves the staff’s decision-making processes by refocusing resources on children with the highest needs.” The main difference between New York and Allegheny County is that Allegheny County owns its algorithm, so it can do all the things New York wants to do and is not limited by private companies’ rights. Therefore we see that our policies are evolving to try to tackle these issues, but they do not yet protect the people from private companies.
Some, like Ruha Benjamin, author of Race After Technology and a sociologist at Princeton University in New Jersey, believe that part of the problem is “a lack of diversity among algorithm designers, and a lack of training about the social and historical context of their work” since such diversity and training would help inform the process of designing the models. Others, like Rayid Ghani, a computer scientist at Carnegie Mellon University in Pittsburgh, acknowledge that Machine Learning algorithms are sometimes biased and strongly support auditing them. Ghani also says “his team has carried out unpublished analyses comparing algorithms used in public health, criminal justice and education to human decision making. They found that the machine-learning systems were biased — but less so than the people,” so he is less concerned about the current trend in our society. Similarly, Melissa Hamilton studies legal issues around risk assessment tools, and she “is critical of their use in practice but believes they can do a better job than people in principle. ‘The alternative is a human decision maker’s black-box brain,’ she says.”
What should we do about biased predictive policing algorithms? More and more people criticize these types of algorithms, claiming that studies have shown “these tools are not fit for purpose.” Others believe these biases can be remedied via “algorithmic affirmative action, in which the bias in the data is counterbalanced in some way. One way to do this for risk assessment algorithms, in theory, would be to use differential risk thresholds — three arrests for a Black person could indicate the same level of risk as, say, two arrests for a white person.” This is very controversial, as many are uncomfortable with the idea of holding different groups of people to different standards, even if it is to counterbalance bias.
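The idea of differential risk thresholds can be written down in a few lines. This is only a sketch of the quoted hypothetical, not a description of how any deployed tool works: the `THRESHOLDS` table and the `is_high_risk` function are names invented here, and the values 2 and 3 are taken directly from the example in the quote.

```python
# Sketch of group-specific risk thresholds (hypothetical names; the numbers
# come from the quoted example: three arrests vs. two arrests).

THRESHOLDS = {"white": 2, "black": 3}


def is_high_risk(prior_arrests, group, thresholds=THRESHOLDS):
    """Flag a person as high risk once their arrest count reaches their group's threshold."""
    return prior_arrests >= thresholds[group]


print(is_high_risk(2, "white"))  # reaches the threshold of 2
print(is_high_risk(2, "black"))  # below the threshold of 3
print(is_high_risk(3, "black"))  # reaches the threshold of 3
```

The sketch makes the controversy concrete: the same arrest count leads to different labels depending on group membership, which is exactly what some see as a correction and others see as a double standard.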
Not only is there debate about what we should do with these biased models, but also about how we should define fairness. Alexandra Chouldechova, Assistant Professor of Statistics & Public Policy at Carnegie Mellon University, who also studied ProPublica’s COMPAS findings, says “focusing on outcomes might be a better definition of fairness. To create equal outcomes, … ‘You would have to treat people differently.’” She revised the formula to ensure equal rates of false positives, which led to 59% accuracy for whites and improved the accuracy for African Americans from 63% to 69%. Counties, however, have not changed the way they interpret COMPAS scores to account for this research. Some also believe that the price of this notion of fairness is a sacrifice in public safety: one study found that “optimizing for public safety yields stark racial disparities; conversely, satisfying past fairness definitions means releasing more high-risk defendants, adversely affecting public safety.”
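To make "equal rates of false positives" concrete, here is a small sketch of how a false positive rate could be computed per group. The records and group labels are invented for illustration and are not COMPAS data, and `false_positive_rates` is a hypothetical helper, not part of any existing tool. A false positive here means a person who was labeled high risk but did not go on to reoffend.

```python
# Per-group false positive rate on made-up records (not real COMPAS data).
from collections import defaultdict


def false_positive_rates(records):
    """records: iterable of (group, predicted_high_risk, reoffended) tuples."""
    false_positives = defaultdict(int)
    non_reoffenders = defaultdict(int)
    for group, predicted_high_risk, reoffended in records:
        if not reoffended:
            non_reoffenders[group] += 1
            if predicted_high_risk:
                false_positives[group] += 1
    # Rate = wrongly flagged people / all people who did not reoffend.
    return {g: false_positives[g] / n for g, n in non_reoffenders.items()}


records = [
    ("a", True, False), ("a", True, False), ("a", False, False), ("a", False, False),
    ("a", True, True),
    ("b", True, False), ("b", False, False), ("b", False, False), ("b", False, False),
    ("b", True, True),
]

rates = false_positive_rates(records)
print(rates)
```

In this toy data, group "a" has twice the false positive rate of group "b". A fairness definition based on equal false positive rates would require adjusting the model or its thresholds until the two rates match.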
When it comes to issues like health care and policing, we should not be using algorithms unless they have been thoroughly audited. We cannot as a society accept the release of blatantly biased algorithms, even if they are “less biased than people” as many claim, because we create a feedback cycle that only amplifies the current biases in our society. This will only make our algorithms more biased with time. If we accept biased health care algorithms, we as a society accept inequality, because we are acknowledging that Black patients are not getting the care they need and allowing this issue to become worse in the name of efficiency. When we use predictive policing algorithms that “encode patterns of racist policing behavior” we prime the police to be more likely to stop an African American than a White American, leading to a higher arrest rate for African Americans than for White Americans. When we use biased predictive policing algorithms, we allow the continuation and worsening of incarceration disparity in America and reduce the diversity of our workplaces. As Heaven puts it, “feeding this data into predictive policing tools allows the past to shape the future,” and since our past is biased against minority groups, we will continue to be biased against minority groups. This is unacceptable on the principle of equality.
Also, we must demand transparency. For example, Hamid Khan, an activist who fought for years to get the Los Angeles police to drop PredPol, has also tried to investigate another algorithm called OASys. “[Hamilton] has repeatedly tried to get information from the developers, but they stopped responding to her requests.” This kind of information should be available. If these algorithms are to be used in practice, they must be transparent, because when it comes to arrests and incarceration, justice is more important than a company’s profits.
Another part of the problem is the lack of diversity in tech fields. Studies from 2019 found that “80% of AI professors are men. People of color remain underrepresented in major tech companies. At a 2016 conference on AI, Timnit Gebru, a Google AI researcher, reported there were only six black people out of 8,500 attendees.” This is a problem because it means that the people who are affected by these racially biased algorithms are not involved in their design, testing, or decision making.
Three things need to be handled better in the development and design of these algorithms: using better data, being more socially aware of the histories of racial discrimination and social oppression, and encouraging more diversity in the tech fields. In addition, we must have transparency so that these algorithms can be challenged. If we can’t satisfy these requirements, we should not be using these algorithms in the first place.
Implications of Position
Every biased algorithm I have mentioned in this paper, like PredPol, COMPAS, and Optum’s health risk tool, should be discontinued until it is unbiased and made transparent. This would be expensive, as it would require hiring more police officers and judges and providing more staff and medical supplies for hospitals, but it is a necessity if we want to live in a society that values all lives equally. Although this would not eliminate bias in our society, because personal bias would remain, it is better than systematically reinforcing and amplifying our societal biases.
If a company does want to release an ML/AI algorithm to the public, my proposal requires legislation that would enforce thorough auditing of these algorithms by a third party. It will no longer be enough to self-audit and withhold information from the public. Although this would slow down the release of these models, it would ensure that our taxpaying citizens are not being explicitly discriminated against in our judicial and healthcare systems. We need this to uphold our values of equality, fairness, and transparency.
In order to raise awareness of racial discrimination and its complex relationship with social oppression over generations, universities will need to increase their focus not only on technical knowledge, but on historical and social knowledge as well. In addition, companies should hire a history/ethics consulting team to monitor the design process of these models, require that their employees do thorough research before developing them, or both, so that developers understand how certain data may be biased before they train and release the models.
Finally, we need to increase diversity in the tech field. This can be done by changing the way we recruit. Instead of scanning resumes with an algorithm (another opportunity for racial/gender bias), Michael Li, founder and CEO of The Data Incubator, a data science training and placement firm, suggests using project-based assessments, since “having a process that mimics workplace data science communications adds a whole new level to the assessment that gives loads of signals to the interviewers.”
Machine Learning and Artificial Intelligence algorithms are powerful tools that we are developing as a society to help us automate certain processes. They have the potential to make our society more efficient and help it run more smoothly. But we should not compromise our values of equality in the name of money, efficiency, and “they work well enough”. If we want to live in a society that does not discriminate based on the color of our skin, we must raise our standards for these algorithms. We must demand transparency from these companies. We need legislation to protect the people from a company that is impatient to get rich. We must work to change our workforce to encourage more historically conscious developers and increase the diversity in these fields. These changes will be challenging, frustrating, expensive, and slow, but they are necessary to build a better tomorrow.
Laura Koye is a third-year undergraduate student at CMU studying Math + CS. She’s interested in AI/ML, public health, and privacy. She’s excited to see how ML and AI will revolutionize our tomorrow, but she wants to make sure we, as a society, set high standards for these algorithms. In her free time, she enjoys volunteering, running, cooking, and playing guitar.