AI and Online Images Are Reflecting Flawed Views of Working Women

Just how deeply embedded are social biases about gender and age? A new study published in Nature finds that inaccurate stereotypes about older women are not only pervasive in online images and videos but are perpetuated and amplified by large language models (LLMs).

While previous research has focused on age-related gender bias in specific settings, this research aims to “characterize a culture-wide trend,” explains Douglas Guilbeault, an assistant professor of organizational behavior at Stanford Graduate School of Business. In a series of large-scale studies conducted with Solène Delecourt, PhD ’20, of the University of California, Berkeley, Haas School of Business, and Bhargav Srinivasa Desikan of the University of Oxford/Autonomy Institute, Guilbeault found widespread evidence of bias against older women on popular image and video sites and in the algorithms that power popular AI tools such as ChatGPT.

The study explored how gendered expectations shape our mental picture of women at work — including a tendency to see women in certain jobs as younger than they are. Previous research into this bias has looked at “value-based judgments that it is ‘bad’ to be an older woman,” Guilbeault says. This new research took a broader view, exploring how assumptions about gender and age shape depictions of women in particular roles. “We were first looking at the statistical relationship,” Guilbeault says. “Before we even talk about bias — do people simply perceive women in some jobs as younger, period?”

Census data shows that women live longer than men and that, broadly speaking, there is no age gap between men and women in the workforce. Yet Guilbeault and his colleagues found that younger women are overrepresented in online images. By analyzing over 1.4 million images and videos from online platforms such as Google, Wikipedia, IMDb, Flickr, and YouTube, the researchers found that women are systematically portrayed as younger than men, particularly in depictions of higher-status and better-paid occupations. Guilbeault and his colleagues also documented this gendered age gap in image databases used to train machine learning algorithms.
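To make that comparison concrete, here is a minimal sketch of the kind of aggregate analysis described above, assuming a dataset where each image has already been labeled with an occupation, a perceived gender, and a perceived age (the labeling pipeline itself is the hard part and is out of scope here). The records are toy values, not the study's data.

```python
from collections import defaultdict
from statistics import mean

# Toy records: (occupation, perceived_gender, perceived_age).
# A real analysis would draw these labels from crowdworkers or classifiers.
records = [
    ("ceo", "woman", 38), ("ceo", "man", 52),
    ("ceo", "woman", 41), ("ceo", "man", 49),
    ("nurse", "woman", 41), ("nurse", "man", 44),
]

# Group perceived ages by (occupation, gender).
ages = defaultdict(list)
for occupation, gender, age in records:
    ages[(occupation, gender)].append(age)

# Report the average depicted age gap per occupation.
for occupation in sorted({occ for occ, _ in ages}):
    gap = mean(ages[(occupation, "man")]) - mean(ages[(occupation, "woman")])
    print(f"{occupation}: men depicted {gap:+.1f} years older on average")
```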

Guilbeault and his colleagues also examined nine LLMs to “characterize patterns on the order of billions of words.” Here, too, they found a biased and distorted depiction of older women. “Why would it be showing up in billions of words where there’s no visual presentation of people?” Guilbeault asks. “That is really suggesting it’s woven into the fabric of how we categorize and interpret people in the social world.”
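The article doesn't spell out how the nine LLMs were probed, but one common way to test such associations in text-derived models is an embedding-association measure: checking whether gendered words sit closer to "young" or to "old" in vector space. The sketch below, using small pretrained GloVe vectors via gensim, is an illustrative stand-in for that family of methods, not the authors' procedure.

```python
import gensim.downloader as api
import numpy as np

# Small pretrained word vectors; downloads on first use.
model = api.load("glove-wiki-gigaword-50")

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def youth_association(word: str) -> float:
    """Positive values mean the word sits closer to 'young' than to 'old'."""
    return cosine(model[word], model["young"]) - cosine(model[word], model["old"])

for word in ["woman", "man", "she", "he"]:
    print(word, round(youth_association(word), 3))
```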

In another recent study, Guilbeault found that people exposed to online images of occupations were more likely to associate certain jobs with one gender. Even a 45-minute exposure significantly shifted people’s perceptions. That is concerning, Guilbeault says: When people spend all day online, what biases are they unconsciously absorbing?

This question gets to the heart of what Guilbeault seeks to understand — how people categorize one another and the language they use to do so. “The concepts we use really play a big role in creating the world that we live in,” he says.

Stereotypes Reflected in Resumes

To understand how these biases about women in the workplace are propagated by generative AI, Guilbeault and his colleagues prompted ChatGPT to generate more than 34,500 unique resumes for 54 occupations using typical male or female names.
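As a rough illustration, an audit loop of this shape might look like the following sketch, written against the OpenAI Python client. The model name, prompt wording, and candidate names are assumptions for illustration; the study's actual prompts and its full scale (34,500 resumes across 54 occupations) are only summarized above.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative inputs; the study used many names and 54 occupations.
NAMES = {"female": ["Emily Walsh"], "male": ["James Walsh"]}
OCCUPATIONS = ["financial analyst", "construction manager"]

def generate_resume(name: str, occupation: str) -> str:
    """Ask the model to write a resume for a named candidate; the prompt is a guess."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder, not the study's model
        messages=[{
            "role": "user",
            "content": f"Write a realistic resume for {name}, "
                       f"applying to work as a {occupation}.",
        }],
    )
    return response.choices[0].message.content

resumes = {
    (gender, name, occupation): generate_resume(name, occupation)
    for gender, names in NAMES.items()
    for name in names
    for occupation in OCCUPATIONS
}
# Downstream, one would extract graduation years and years of experience
# from each resume to compare the ages the model implies by gender.
```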

The results were stark. When ChatGPT produced resumes for hypothetical women, it generated work histories that portrayed them as younger and less experienced. The researchers then asked ChatGPT to evaluate the quality of those resumes. Drawing on the ages and experience it had woven into these imagined work histories, the model gave older men the highest ratings, even though every resume had been generated from the same initial information.
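The evaluation step can be sketched the same way: each generated resume is fed back to the model with a request for a score. The prompt and model below are again illustrative guesses; the design point is that the rater sees only what the generator produced, so any age gap baked into the resumes flows directly into the ratings.

```python
from openai import OpenAI

client = OpenAI()  # same assumption as the previous sketch

def rate_resume(resume_text: str) -> str:
    """Ask the model to score a resume; the prompt wording is an illustrative guess."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder, not the study's model
        messages=[{
            "role": "user",
            "content": (
                "Rate the overall quality of this job candidate's resume "
                "on a scale of 1 to 10. Reply with only the number.\n\n"
                + resume_text
            ),
        }],
    )
    return response.choices[0].message.content
```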

This suggests that AI-based tools employers use to screen resumes could give older men an advantage while putting older women and younger job seekers at a disadvantage. For applicants who may already face discrimination in hiring, the LLM not only reflects this bias but actively reinforces it.

How did these stereotypes and distortions make their way into ChatGPT’s algorithm? When asked to create a resume, “it’s drawing on countless ideas about what a person is, what a particular job requires, what makes a good candidate, and which skills are relevant,” Guilbeault explains. “And within that process, there are countless opportunities for stereotypes to slip in.”

For instance, if the role involves construction, the system may emphasize whether someone can fix things. “That taps into the stereotype that men are better at fixing things,” Guilbeault says. The model absorbs these interpretive biases from the massive pool of human data that shapes it, which “leaks into how it approaches the resume-generation problem.”

Since AI companies are secretive about their training methods, it’s nearly impossible to know exactly how generative models like ChatGPT pick up their biases. While the source is likely human-generated data, it’s difficult to pinpoint the specific origin of biased information. “The causal explanation is likely complex as well,” Guilbeault says. Still, he and his fellow researchers were able “to demonstrate, overwhelmingly, that this huge bias is basically everywhere.”

AI companies are aware of their models’ biases, Guilbeault notes. So far, their solution is to apply filters that block material flagged as biased or stereotypical. “This is a simplistic approach,” Guilbeault says, and it can miss more nuanced issues such as gendered ageism. While AI companies may “slap on another filter” to improve their output, this won’t resolve deeper flaws inside their models. “To make real progress, the bias has to be addressed at a fundamental level.”
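To see why such filtering is blunt, consider a toy version: a blocklist check over model output. Production filters are far more sophisticated, but anything that flags surface features can catch overt slurs while staying blind to statistical patterns like systematically shorter work histories for women. The terms below are placeholders, not real filter rules.

```python
# A deliberately naive output filter; all terms are illustrative placeholders.
BLOCKLIST = {"example_slur", "example_stereotyped_phrase"}

def passes_filter(text: str) -> bool:
    """True if no blocklisted term appears. Catches surface offenses only."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)

# A resume that quietly gives a woman five fewer years of experience than
# an otherwise identical man's passes this check without raising a flag.
```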

Until then, Guilbeault urges anyone using generative AI to be “deeply, deeply cautious, and aware these models often distort reality.” The reality is that inaccurate assumptions about gender and age are more persistent than many people imagine. “There’s a widespread belief that the problem is basically solved,” Guilbeault says. “And it’s not.”


