Jun 05, 2025
Language models are trained on billions of sentences, with data sourced from human-generated content including feminist blogs, corporate DEI statements, gossip sites and men’s-rights Reddit threads. So, what does this mean for how gender is handled by AI?
The Oxford Internet Institute’s Franziska Sofia Hafner, along with her co-authors Dr Ana Valdivia, Departmental Research Lecturer in Artificial Intelligence, Government, and Policy, and Dr Luc Rocher, UKRI Future Leaders Fellow and senior research fellow, explores whether language models are perpetuating stereotypes.
‘What is a woman?’ Early language models answered such questions with a range of misogynistic stereotypes. Modern language models refuse to give any answer at all. While this shift suggests progress, it raises the question: If computer scientists remove the worst associations, so that women are not ‘dumb’, ‘too emotional’, or ‘so dramatic’, is the issue of gender in language models fixed?
This is the question my co-authors, Dr Ana Valdivia and Dr Luc Rocher, and I asked ourselves in our recent study.
Language models are trained on billions of sentences, such as ‘women are the future’ from a feminist blog, ‘women are more likely to experience chronic pain’ from a health website, or ‘women are underrepresented in leadership’ from a corporate diversity statement. However, language models’ training data also contains text from men’s-rights Reddit threads, Andrew Tate’s YouTube comment sections, and tabloids sharing the latest celebrity gossip.
From all this data, language models can learn that a sentence beginning ‘women are…’ is likely to continue with sexist stereotypes. This is not a computer bug; it is part of the core mechanism through which language models learn to generate text.
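To make this mechanism concrete, here is a minimal sketch (not from the study; it assumes the Hugging Face transformers library and the openly available GPT-2 model, which are illustrative choices) of how one might inspect the continuations a plain, un-aligned language model considers most probable after a prompt like ‘Women are’:

```python
# Minimal sketch (illustrative, not the study's method): inspect the
# next-token probabilities a base language model assigns after a prompt.
# Assumes the Hugging Face `transformers` library and the open GPT-2 model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Women are"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# Probability distribution over the token that would follow the prompt.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=10)

for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode([int(token_id)]):>15s}  {prob.item():.3f}")
```

Whatever words appear at the top of that list are simply the statistical echo of the training data: the model ranks continuations by how often similar phrases occurred in the text it was trained on.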
AI developers have compelling reasons to build models which do not spew out awful stereotypes. Most importantly, AI-generated texts full of harmful stereotypes might be offensive to chatbot users or reinforce their pre-existing biases. Developers also have a pragmatic interest in attempting to fix their model’s bias problem, as instances of such text sparking outrage online can seriously harm their company’s reputation.
To stop the worst associations from surfacing in generated text, researchers have developed many smart techniques to debias, align, or steer these models. While the models still learn that ‘women are manipulative’ is a statistically solid prediction, these techniques can teach models not to say the quiet part out loud. Fundamentally, their internal representations of gender are still based on some of the worst stereotypes the internet has to offer, but at first glance these remain invisible in users’ everyday interactions.
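One way to see the gap between surface behaviour and internal representations is to score sentences directly rather than asking the model to generate them. The following is a hypothetical probing sketch, not the method used in the study: it compares the log-probability a model assigns to a stereotyped sentence against a minimally edited counterpart. The sentence pairs and the GPT-2 model are illustrative assumptions; an aligned chat model may refuse to say such sentences while still scoring them very differently.

```python
# Hypothetical probing sketch (not the study's method): compare how likely
# a model considers a stereotyped sentence versus a minimally edited
# counterpart. Sentence pairs and the GPT-2 model are illustrative choices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_log_prob(sentence: str) -> float:
    """Sum of log-probabilities the model assigns to the sentence's tokens."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Predictions at position t are scored against the token at position t+1.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    token_log_probs = log_probs[torch.arange(targets.shape[0]), targets]
    return token_log_probs.sum().item()

pairs = [
    ("Women are too emotional.", "Men are too emotional."),
    ("Women are so dramatic.", "Men are so dramatic."),
]

for stereotyped, counterpart in pairs:
    gap = sentence_log_prob(stereotyped) - sentence_log_prob(counterpart)
    print(f"{stereotyped!r} vs {counterpart!r}: log-prob gap = {gap:+.2f}")
```

A positive gap would indicate that, underneath any polite refusals, the model still treats the stereotyped version as the more plausible sentence.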
Source: https://www.ox.ac.uk/news/features/do-language-models-have-issue-gender