Proxy Issues—Resolving Discrimination in Algorithms

With regulators increasingly focusing on algorithmic discrimination, human intervention in programming predictive models and artificial intelligence (AI) will be more important than ever. While the list of positive uses for AI continues to grow, algorithms can also produce unintended discriminatory results, known as “disparate impact.” Algorithmic discrimination can occur when a computer model makes a decision or prediction that has the unintended consequence of denying opportunities or benefits more frequently to members of a protected class than to a comparison group outside that class. A discriminatory factor can infiltrate an algorithm in several ways. One of the most common is when the algorithm includes a proxy for a protected characteristic because the data suggest that the proxy is predictive of, or correlated with, a legitimate target outcome.

Before exploring proxy discrimination, it is important to note that AI and algorithms improve our daily lives in ways that benefit society. For example, algorithms speed up credit assessments, allowing consumers to be approved for a loan in minutes. Cryptographic algorithms improve the consumer experience by making digital signatures possible. And if you’ve ever used GPS while driving, you’ve benefited from algorithms that determine your location and estimate distance and travel time.

Proxy discrimination occurs when a facially neutral trait is used in place of a prohibited trait. It has sometimes been used intentionally to evade laws prohibiting discrimination in lending, housing, or employment, such as the Fair Housing Act (FHA), the Equal Credit Opportunity Act (ECOA), and the Equal Employment Opportunity Act (EEOA). A common example of proxy discrimination is “redlining” in the financial sector. In the mid-1900s, instead of openly discriminating on the basis of race in their underwriting and pricing decisions, some financial institutions used zip codes and neighborhood boundaries in place of race to avoid lending to predominantly African-American neighborhoods. In this case, proxies were used in place of the prohibited characteristic to achieve a discriminatory objective. But proxy discrimination need not be intentional.

When a proxy correlated with membership in a protected class is predictive of an algorithm’s legitimate goal, the use of that proxy may seem “rational.” For example, higher SAT scores may be correlated with better student loan repayment because the scores are designed to predict graduation rates, which are strongly correlated with loan repayment. But at the same time, there are racial disparities in SAT scores. Thus, underwriting algorithms that rely on a student’s SAT scores to approve or assess loans may inadvertently reproduce the racial disparities in those scores. Although unintended and seemingly rational, this disparate result could become the basis of an ECOA discrimination complaint.

Once a disparate impact is shown, the onus is on the algorithm’s user to demonstrate that the practice serves a legitimate, non-discriminatory purpose rooted in business necessity. Even if a specific factor in the algorithm meets the legitimate business purpose standard, it may still violate the law if the algorithm could achieve its legitimate purpose with a less discriminatory alternative. For example, instead of using SAT scores to assess the likelihood of repaying a student loan, would grade point average (GPA) work just as well with less of a discriminatory effect? Humans overseeing AI should ask themselves these questions and work with statisticians and economists as needed.
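To make the “less discriminatory alternative” question concrete, here is a minimal Python sketch that scores two candidate approval rules on the same applicants: how often each rule matches actual repayment, and how far apart the approval rates are for the two groups. Everything in it is an assumption made for illustration: the `Applicant` fields, the cutoffs of 1100 and 3.0, and the eight records are invented and say nothing about real SAT or GPA data.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    protected: bool  # True if the applicant belongs to the protected class
    sat: int
    gpa: float
    repaid: bool     # observed outcome: did the borrower repay?


def evaluate_rule(applicants, approve):
    """Return (accuracy, approval_rate_gap) for a yes/no approval rule.

    accuracy: share of applicants where the rule's decision matches repayment.
    approval_rate_gap: comparison-group approval rate minus protected-group rate.
    """
    correct = sum(approve(a) == a.repaid for a in applicants)

    def rate(is_protected):
        members = [a for a in applicants if a.protected == is_protected]
        return sum(approve(m) for m in members) / len(members)

    return correct / len(applicants), rate(False) - rate(True)


# Invented records, for illustration only.
applicants = [
    Applicant(True, 1000, 3.6, True),
    Applicant(True, 1050, 2.9, True),
    Applicant(True, 980, 2.5, False),
    Applicant(True, 1250, 3.7, True),
    Applicant(False, 1300, 3.5, True),
    Applicant(False, 1280, 3.8, True),
    Applicant(False, 1220, 2.6, False),
    Applicant(False, 1150, 3.3, True),
]

# Hypothetical cutoffs; a real model would be far more involved.
sat_accuracy, sat_gap = evaluate_rule(applicants, lambda a: a.sat >= 1100)
gpa_accuracy, gpa_gap = evaluate_rule(applicants, lambda a: a.gpa >= 3.0)

print(f"SAT rule: accuracy={sat_accuracy:.3f}, approval gap={sat_gap:.2f}")
print(f"GPA rule: accuracy={gpa_accuracy:.3f}, approval gap={gpa_gap:.2f}")
```

On this toy data the GPA rule happens to be both more accurate and less disparate, but that is a property of the invented records. The value of the exercise is the side-by-side comparison, which gives the humans overseeing the model something specific to weigh.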


So what is a human to do? First, the organization’s compliance staff must know what factors its artificial intelligence or algorithms use in decision models. Ask for a list of the factors and the decisions that will be made based on each of them. Second, compliance personnel should know which factors are prohibited in making certain decisions by checking relevant laws and regulations, for example, ECOA, EEOA, and the Genetic Information Nondiscrimination Act (GINA). Third, consider whether any factors in the algorithm are directly prohibited by applicable law, or whether they are logically related to prohibited factors and are therefore potential “proxies.”

Determining whether your decision factors could be proxies for a protected characteristic requires human intervention and some research. Gender offers a useful illustration. It is illegal under fair lending laws to use gender, or any proxy for gender, in extending credit. So, look for factors that approximate gender, such as height, weight, first name, Netflix viewing habits, and shopping habits (e.g., what scent of shampoo you buy). In many cases, publicly available statistics can confirm whether a factor is strongly correlated with a protected characteristic.
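Where the organization has its own data, the same check can be run in-house. The sketch below, in Python, computes a plain Pearson correlation between each candidate factor and a 0/1 indicator of protected-class membership and flags the factors that exceed a cutoff. The factor names, the sample values, and the 0.5 threshold are all assumptions chosen for illustration; an appropriate cutoff is a judgment call for statisticians and counsel, not a legal standard.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column cannot track group membership
    return cov / sqrt(var_x * var_y)


def flag_possible_proxies(factors, protected, threshold=0.5):
    """Return {factor_name: correlation} for factors whose absolute
    correlation with the protected indicator meets the threshold."""
    flagged = {}
    for name, values in factors.items():
        r = pearson(values, protected)
        if abs(r) >= threshold:
            flagged[name] = round(r, 3)
    return flagged


# Invented records: 1 = member of the protected class, 0 = not.
protected = [1, 1, 1, 1, 0, 0, 0, 0]
factors = {
    "height_cm": [160, 165, 158, 170, 178, 182, 175, 185],
    "account_age_months": [12, 40, 25, 31, 14, 38, 27, 29],
}

flagged = flag_possible_proxies(factors, protected)
print(flagged)
```

A flagged factor is only a lead for further review. Correlation alone does not show that the factor drives a discriminatory outcome, and a factor with low linear correlation can still act as a proxy in combination with others.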

Another example comes from the healthcare industry. Health insurers are legally prohibited from using genetic test results under the Genetic Information Nondiscrimination Act (GINA), so companies would be well advised to ensure that their algorithms do not use proxies such as family medical history or visits to specific websites (e.g., a disease support group). Age offers yet another example: a facially neutral data point such as years since graduation is a clear indicator of age.

Once a proxy is suspected, the next step is to determine its impact on the decision, its usefulness in the model, and whether it causes or contributes to a discriminatory impact. This can be done through statistical methods and chart reviews. Once the factor has been identified and its impact quantified, it is time for human judgment to take over and decide whether the factor is truly necessary to achieve a legitimate goal, or whether a less discriminatory substitute may serve the same purpose.
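One simple statistical check is to compare approval rates between groups with and without the suspected factor in the model. The Python sketch below computes an adverse impact ratio: the protected group’s approval rate divided by the comparison group’s. The decision lists are invented, and the 0.8 benchmark is an assumption borrowed from the “four-fifths” rule of thumb in employment-selection guidance; it is used here only as a screening aid, not as the legal test for any particular statute.

```python
SCREENING_BENCHMARK = 0.8  # assumed rule-of-thumb cutoff, not a legal standard


def approval_rate(decisions):
    """Share of approvals in a list of 1 (approved) / 0 (denied) decisions."""
    if not decisions:
        raise ValueError("no decisions to evaluate")
    return sum(decisions) / len(decisions)


def adverse_impact_ratio(protected_decisions, comparison_decisions):
    """Protected-group approval rate divided by comparison-group approval rate."""
    comparison_rate = approval_rate(comparison_decisions)
    if comparison_rate == 0:
        raise ValueError("comparison group has no approvals")
    return approval_rate(protected_decisions) / comparison_rate


# Invented decisions from a hypothetical model run two ways.
with_factor = {
    "protected": [1, 0, 0, 1, 0, 0, 0, 1, 0, 0],   # 3 of 10 approved
    "comparison": [1, 1, 0, 1, 1, 0, 1, 1, 1, 0],  # 7 of 10 approved
}
without_factor = {
    "protected": [1, 1, 0, 1, 0, 1, 0, 1, 1, 0],   # 6 of 10 approved
    "comparison": [1, 1, 0, 1, 1, 0, 1, 1, 1, 0],  # 7 of 10 approved
}

ratio_with = adverse_impact_ratio(with_factor["protected"], with_factor["comparison"])
ratio_without = adverse_impact_ratio(without_factor["protected"], without_factor["comparison"])

for label, ratio in (("with factor", ratio_with), ("without factor", ratio_without)):
    status = "review" if ratio < SCREENING_BENCHMARK else "within benchmark"
    print(f"{label}: ratio={ratio:.3f} ({status})")
```

A ratio well below the benchmark with the factor, and near it without, suggests the factor contributes to the disparity. With samples this small the difference could easily be noise, so real reviews would pair a ratio like this with significance testing.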

The takeaway is this: any data point (e.g., type of deodorant) can be taken out of context by an algorithm and lead to proxy discrimination. And that discrimination, regardless of intent, can lead to lawsuits and regulatory risk. Systematic and robust human oversight is a step in the right direction to avoid proxy discrimination and the liability that comes with it.
