Civic Tech

ChatGPT is biased against resumes that imply disability

When asked to explain its rankings, GPT-4 showed both explicit and implicit ableism.

Last year, while looking for research internships, Kate Glazko, a graduate student at the University of Washington, noticed that recruiters were using AI tools like OpenAI’s ChatGPT to summarize resumes and rank candidates. Automated screening in hiring is not new, but Glazko, who studies how AI can reflect and worsen real-world biases, wondered how these systems would handle resumes that suggested a candidate had a disability.

In a new study, researchers at UW found that ChatGPT often ranked resumes with disability-related achievements, such as the "Tom Wilson Disability Leadership Award," lower than otherwise similar resumes without such honors. When asked why, the AI's explanations reflected bias; for instance, it suggested that a resume with an autism leadership award showed "less emphasis on leadership roles," echoing stereotypes about autistic people.

However, when the researchers customized the tool with written instructions to avoid ableism, it showed less bias for most of the disabilities tested. Rankings improved for five of the six (deafness, blindness, cerebral palsy, autism, and the general term "disability"), though only three of those enhanced resumes ranked higher than the version that never mentioned disability.

The findings were presented on June 5 at the 2024 ACM Conference on Fairness, Accountability, and Transparency in Rio de Janeiro.

“Using AI to rank resumes is becoming more common, but we don’t know much about whether it’s safe and effective,” said Glazko, the study’s lead author. “Disabled job seekers often wonder if they should mention their disability in their resume, even when a human is reviewing it.”

For the study, the researchers used one author's publicly available CV, about 10 pages long, and created six enhanced versions, each adding disability-related credentials such as a scholarship or an award. They then asked ChatGPT's GPT-4 model to rank each enhanced resume against the original for a real job listing. Across 60 trials, GPT-4 ranked the enhanced resumes first only 25% of the time.
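The article does not reproduce the study's prompts, and the authors ran their trials through the ChatGPT interface rather than in code. Still, a minimal sketch of a comparable pairwise-ranking trial, using OpenAI's Python client, might look like the following; the prompt wording, function name, and inputs are illustrative assumptions, not the study's materials.

```python
# Hypothetical sketch of one pairwise ranking trial, loosely modeled on the
# study's setup. The prompt text is an illustrative assumption; the authors
# ran their 60 trials through the ChatGPT interface, not the API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def rank_pair(job_listing: str, control_cv: str, enhanced_cv: str) -> str:
    """Ask GPT-4 to rank two CVs against a job listing and explain why."""
    prompt = (
        f"Job listing:\n{job_listing}\n\n"
        f"Candidate A (CV):\n{control_cv}\n\n"
        f"Candidate B (CV):\n{enhanced_cv}\n\n"
        "Rank the two candidates for this job and explain your ranking."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# The study repeated trials like this 60 times across the six enhanced
# versions and counted how often the enhanced CV was ranked first.
```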

“In a fair world, the enhanced resume should be ranked first every time,” said Jennifer Mankoff, senior author and UW professor. “Recognition for leadership skills, for example, should always be valued.”

Asked to explain its rankings, GPT-4 mixed explicit and implicit ableism into its reasoning, suggesting, for example, that a candidate with depression had a focus on personal challenges that detracted from the job's technical requirements.

The researchers then tested whether the system could be steered toward less biased behavior. Using OpenAI's GPTs Editor tool, they gave GPT-4 written instructions to avoid ableism and to follow disability justice and DEI principles. In repeated tests, this customized system ranked the enhanced resumes above the control resume in 37 of 60 trials. However, some disabilities, such as autism and depression, saw little improvement.
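The GPTs Editor is a no-code interface, so the study's mitigation was written instructions rather than code. The closest API analogue is supplying those instructions as a system message, as in the hedged sketch below; the instruction wording here is an assumption, not the authors' text.

```python
# Hypothetical API analogue of the study's customized GPT: the same pairwise
# ranking, but with debiasing instructions supplied as a system message.
# The instruction wording below is an assumption, not the authors' text.
from openai import OpenAI

client = OpenAI()

DEBIAS_INSTRUCTIONS = (
    "Do not exhibit ableist bias. Apply disability justice and DEI "
    "principles, and treat disability-related awards and leadership "
    "roles as evidence of merit like any other credential."
)

def rank_pair_debiased(job_listing: str, cv_a: str, cv_b: str) -> str:
    """Rank two CVs for a job listing with anti-ableism instructions applied."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": DEBIAS_INSTRUCTIONS},
            {
                "role": "user",
                "content": (
                    f"Job listing:\n{job_listing}\n\n"
                    f"Candidate A (CV):\n{cv_a}\n\n"
                    f"Candidate B (CV):\n{cv_b}\n\n"
                    "Rank the two candidates and explain your ranking."
                ),
            },
        ],
    )
    return response.choices[0].message.content
```

Whether a blanket instruction like this actually generalizes is what the study probes: it reduced bias for some disabilities but made little difference for autism and depression.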

“Users need to be aware of AI biases in these tasks,” Glazko said. “Recruiters using ChatGPT might not be able to correct or even notice these biases.”

The researchers point to organizations such as ourability.com and inclusively.com that work to improve job outcomes for disabled people, and they stress the need for more research on AI bias. That includes testing other AI systems, exploring bias against a wider range of disabilities, and studying how disability intersects with other attributes such as gender and race.

“It’s crucial to study and document these biases,” Mankoff said. “We hope our work contributes to a larger conversation about ensuring technology is used fairly and equitably for all identities, not just those with disabilities.”

-EUREKALERT