Generating Fairness: Tackling Bias in Generative AI

Dr. Valerie MORIGNAT
8 min read · Mar 1, 2023


Valerie Morignat, Neural Art, 2023. www.TheAIforge.com

Generative AI models, such as DALL-E, Stable Diffusion, BERT, and GPT-4, have gained significant popularity due to their ability to create new content by learning from existing data. These foundation models are trained on extensive datasets using self-supervision at scale and are then adapted to a wide range of downstream applications. While generative AI has the potential to enhance human productivity and creativity, it can also reinforce the biases and assumptions present in the data it learns from and generates. For example, text-to-image generators may reinforce the idea that beauty is synonymous with lighter skin tones, while chatbots may associate male names with career-oriented words and female names with family-oriented words. This article explores the different types of bias that can arise in Generative AI and proposes strategies to mitigate them.

AI bias is a sociotechnical challenge.

Bias in technical systems is widely understood as a distortion that undermines the representativeness of a statistical outcome. However, AI bias is not just a statistical phenomenon. It is a sociotechnical challenge that affects the entire AI lifecycle and can lead to discriminatory outcomes. For instance, the 2022 Stanford AI Index Report revealed that CLIP, a neural network that learns visual concepts from natural language supervision, misclassified images of Black people as nonhuman at a rate more than twice that of any other race. This example alone highlights the importance of prioritizing fairness in the design and development of Generative AI models. However, creating fair AI systems is a complex task that involves careful consideration of many interdependent factors and a thorough assessment of the model’s application context. The challenge is particularly pronounced for Generative AI models designed for creative content, as they draw on a broad range of culturally anchored symbols, aesthetic norms, social representations, and divergent historical legacies. To address these challenges and generate fairness at scale, designers and developers must tackle the three dimensions of AI bias: statistical and computational, human, and systemic.

Diversity in training data is crucial to avoid bias in Generative AI.

Statistical and computational biases often stem from non-representative or unbalanced training data. If certain ethnicities or cultures are underrepresented or excluded from the training data, Generative AI models may produce inaccurate or culturally insensitive representations, further marginalizing these groups. Another concern is the risk of reinforcing the hegemony of Western cultural and aesthetic norms: AI models that generate stereotypical or biased depictions of underrepresented cultures can entrench and perpetuate these dominant values. To mitigate these risks, it is essential to ensure that training data sets are diverse, inclusive, and representative of the full spectrum of human experiences and perspectives. Only by building models grounded in a deep understanding of diverse cultures and identities can we realize the full potential of Generative AI. At the training data level, it is crucial to monitor for red flags such as skewed distributions, missing feature values, or unexpected feature values that can significantly impact underrepresented groups. Statistical and computational biases arise for many reasons, ranging from selection and sampling biases to algorithmic amplification, concept drift, and emergent bias. Building such data sets requires a comprehensive approach that spans data collection and algorithmic design; gathering data from a wide range of sources and actively seeking out underrepresented voices and perspectives is a critical first step. Creating models that accurately reflect the complexity and diversity of the real world also requires a nuanced understanding of the social and cultural contexts in which they will be used, as well as a commitment to ongoing evaluation and refinement to ensure that they remain fair over time.
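As a concrete illustration, monitoring for skewed representation can start with a very small audit step. The sketch below is a minimal example in Python; the function and field names are hypothetical, not a standard API, and a real audit would cover many more attributes and intersections of attributes:

```python
from collections import Counter

def audit_group_balance(records, group_key, threshold=0.05):
    """Flag groups whose share of the training data falls below a threshold.

    `records` is a list of dicts with a demographic attribute under
    `group_key`. Names here are illustrative only.
    """
    counts = Counter(r[group_key] for r in records if r.get(group_key) is not None)
    total = sum(counts.values())
    shares = {group: n / total for group, n in counts.items()}
    flagged = [g for g, share in shares.items() if share < threshold]
    return shares, flagged

# Toy data set in which one region is heavily underrepresented.
data = [{"region": "NA"}] * 70 + [{"region": "EU"}] * 28 + [{"region": "SSA"}] * 2
shares, flags = audit_group_balance(data, "region")
print(shares)  # each group's share of the data
print(flags)   # groups below the 5% threshold: ['SSA']
```

A check like this catches only the crudest imbalances; it says nothing about label quality or stereotyped content within each group, which require qualitative review.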

Human cognitive biases can significantly affect the fairness of Generative AI models.

There is ample evidence that human cognitive biases can result in flawed decision-making within AI teams and biased AI models. For instance, confirmation bias can lead AI team members to prioritize information that confirms their pre-existing beliefs or assumptions, producing a lack of critical evaluation and an unwillingness to consider alternative perspectives. Anchoring bias can result in flawed decisions and the propagation of inaccurate assumptions based on incomplete or erroneous initial information. Groupthink can lead to a lack of creativity, a reluctance to challenge assumptions or decisions, and a decreased likelihood of identifying biases in AI models.
As a result of such cognitive biases, an AI team may overlook the risk of perpetuating cultural degradation and amplifying biases prevalent in Western art when training a text-to-image model on image sets that primarily represent Western art. These risks range from women being objectified and hypersexualized to indigenous people being stereotyped and portrayed as primitive or exotic. Moreover, the limited geographic and cultural diversity of large training data sets, which mainly originate from North America and Europe, risks leading to the underrepresentation or misrepresentation of certain demographics and regions. To ensure fair and equitable generative AI outputs, AI teams must engage in Responsible AI practices and foster critical evaluation and interdisciplinary collaboration with diverse stakeholders.

Systemic biases also remain a critical challenge in Generative AI.

This type of bias refers to the systematic advantages or disadvantages experienced by particular groups due to institutional practices, norms, and procedures. Despite their remarkable language processing capabilities, Large Language Models (LLMs) have been shown to reproduce and amplify gender biases. These models are more likely to associate historically male-dominated professions with men and historically female-dominated professions with women, a consequence of societal prejudice present in the training data and reflected in word embeddings. Word embeddings are vector representations of words in a high-dimensional space; words positioned close to each other in that space are assumed to have similar meanings. Large pre-trained language models use these input representations to learn the relationships between words and their contextual meaning in the training data. To mitigate this type of bias in GPT-3, OpenAI developers have taken measures such as fine-tuning the model with diverse data sets and incorporating techniques like data augmentation and counterfactual data. However, since systemic bias is a sociotechnical challenge, it is also crucial to implement a holistic approach that addresses all levels of the AI development process. That includes ensuring that the AI team is diverse and multidisciplinary. It also means involving diverse internal and external stakeholders, considering multiple dimensions of identity, and designing fair and transparent evaluation criteria. Additionally, it is crucial to empower end-users to detect biases in Generative AI models in order to promote equity and inclusivity. While technical experts can uncover certain biases, end-users offer unique perspectives and experiences that can surface issues experts may have overlooked. By enabling end-user feedback early after the model is put into production, we can proactively address blind spots and rectify unforeseen biases.
Such an approach is integral in building trust and ensuring that Generative AI models are tailored to meet the diverse needs and experiences of the communities they serve.
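To make the word-embedding point concrete, here is a minimal sketch. The three-dimensional vectors are toy values invented for illustration; real embeddings would come from a trained model such as word2vec or an LLM's input layer. Projecting profession words onto a "he minus she" direction exposes the kind of gender association described above:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional embeddings (illustrative only).
emb = {
    "he":       [1.0, 0.1, 0.0],
    "she":      [-1.0, 0.1, 0.0],
    "nurse":    [-0.7, 0.5, 0.2],
    "engineer": [0.8, 0.4, 0.1],
}

# A simple gender direction: he - she.
gender_dir = [a - b for a, b in zip(emb["he"], emb["she"])]

for word in ("nurse", "engineer"):
    bias = cosine(emb[word], gender_dir)
    print(f"{word}: {bias:+.2f}")  # positive = male-associated, negative = female-associated
```

In embeddings trained on real corpora, this kind of projection has been used to surface exactly the profession-gender associations the paragraph above describes; debiasing methods attempt to neutralize the component along such directions.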

Valerie Morignat, Neural Art, 2023. www.TheAIforge.com

Fairness is not a one-size-fits-all concept and must be tailored to the context.

The recognition that bias is a sociotechnical challenge has prompted a shift towards focusing on fairness as a means of addressing bias in AI. Fairness is often approached through the lens of fairness-related harms, with an emphasis on prioritizing the most affected groups. To measure bias and ensure fairness, a range of fairness metrics have been developed, which provide mathematical definitions of fairness. These include commonly used metrics such as Equalized Odds (which checks whether false positive and false negative rates are equivalent across groups), Predictive Parity (which verifies whether precision rates are equivalent for the subgroups under consideration), Counterfactual Fairness (which measures whether similar individuals are treated similarly, regardless of their membership in a protected group), and Demographic Parity (which verifies that the model’s classifications do not depend on a given sensitive attribute).
However, not all fairness metrics are compatible with one another, and the definition of fairness must be contextualized for each situation. As such, it is essential to carefully consider the metrics used to measure fairness and the specific context in which they are applied.
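A small sketch can make two of these definitions concrete, and show why metrics can conflict. The labels, predictions, and group assignments below are invented for illustration; in this toy case Demographic Parity holds (equal selection rates) while Equalized Odds is violated (unequal error rates):

```python
def group_rates(y_true, y_pred, groups, group):
    """Selection rate, true positive rate, and false positive rate
    for one demographic group."""
    idx = [i for i, g in enumerate(groups) if g == group]
    sel = sum(y_pred[i] for i in idx) / len(idx)
    tp = sum(1 for i in idx if y_true[i] == 1 and y_pred[i] == 1)
    fn = sum(1 for i in idx if y_true[i] == 1 and y_pred[i] == 0)
    fp = sum(1 for i in idx if y_true[i] == 0 and y_pred[i] == 1)
    tn = sum(1 for i in idx if y_true[i] == 0 and y_pred[i] == 0)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return sel, tpr, fpr

# Toy labels and predictions for two equal-sized groups.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
groups = ["A"] * 4 + ["B"] * 4

sel_a, tpr_a, fpr_a = group_rates(y_true, y_pred, groups, "A")
sel_b, tpr_b, fpr_b = group_rates(y_true, y_pred, groups, "B")

print("demographic parity gap:", abs(sel_a - sel_b))           # 0.0 — satisfied
print("equalized odds gaps:", abs(tpr_a - tpr_b), abs(fpr_a - fpr_b))  # 0.5, 0.5 — violated
```

The same predictions can therefore pass one fairness test and fail another, which is why the metrics must be chosen for the context rather than applied wholesale.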

On the mitigation front, various fairness methods can be employed to address bias and promote fairness in AI. One such method consists of adding fairness constraints to algorithms: post-processing the model’s output after the model has been run, altering the loss function to embed a penalty for violating a fairness metric, or adding a mathematical constraint to an optimization problem.
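As a sketch of the second option, a training loss can combine an ordinary task loss with a weighted penalty on a smoothed demographic parity gap. The function names and toy scores below are illustrative, not a standard library API:

```python
import math

def bce(y_true, p):
    """Mean binary cross-entropy (the ordinary task loss)."""
    return -sum(y * math.log(q) + (1 - y) * math.log(1 - q)
                for y, q in zip(y_true, p)) / len(p)

def parity_penalty(p, groups):
    """Squared gap between the groups' mean predicted scores —
    a smooth, differentiable proxy for the demographic parity gap."""
    def mean(g):
        return sum(q for q, gi in zip(p, groups) if gi == g) / groups.count(g)
    return (mean("A") - mean("B")) ** 2

def fair_loss(y_true, p, groups, lam=1.0):
    """Task loss plus fairness penalty; lam trades accuracy for parity."""
    return bce(y_true, p) + lam * parity_penalty(p, groups)

y_true = [1, 0, 1, 0]
groups = ["A", "A", "B", "B"]
skewed   = [0.9, 0.8, 0.3, 0.2]  # group A systematically scored higher
balanced = [0.9, 0.2, 0.8, 0.3]  # scores driven by labels, not group

# The skewed predictions pay a parity penalty; the balanced ones do not.
print(fair_loss(y_true, skewed, groups) > fair_loss(y_true, balanced, groups))
```

During training, gradient descent on such a combined loss pushes the model away from solutions whose scores differ systematically between groups, at a cost in raw accuracy controlled by the weight.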

Other methods such as adversarial training and counterfactual data augmentation are also relevant techniques for bias mitigation. Adversarial training exposes the model to perturbed inputs designed to highlight biases, training the AI to make more robust and unbiased decisions. Counterfactual data augmentation involves creating alternative versions of the original data by altering certain attributes. This process broadens the range of perspectives and situations the model encounters, reducing the impact of biases.
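Counterfactual data augmentation can be sketched in a few lines. The word-level swap list below is deliberately simplified and hypothetical; production systems must also handle casing, morphology, names, and ambiguous pronouns such as "her":

```python
# Minimal, illustrative swap list (real systems use far larger lexicons).
SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his",
         "man": "woman", "woman": "man"}

def counterfactual(sentence):
    """Return a gender-swapped copy of a sentence (word-level swap only)."""
    return " ".join(SWAPS.get(w, w) for w in sentence.split())

def augment(corpus):
    """Extend a corpus with the counterfactual version of every sentence."""
    return corpus + [counterfactual(s) for s in corpus]

print(counterfactual("she is a doctor and he is a nurse"))
# → "he is a doctor and she is a nurse"
```

Training on the augmented corpus exposes the model to both versions of each gendered statement, weakening the profession-gender associations it would otherwise absorb.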

Ultimately, achieving fairness in AI requires a concerted, interdisciplinary effort to define and integrate fairness metrics and methods into the development and deployment of AI models. This requires a commitment to understanding the nuances of fairness and how this value can best be applied to promote equitable, trustworthy, and inclusive models.

A comprehensive list of AI fairness tools and frameworks is available to developers.

To aid developers in this task, various fairness tools and frameworks have been developed. Here are some of the most commonly used AI fairness tools available to developers today.

Valerie Morignat, Neural Art, 2023. www.TheAIforge.com

Conclusion

Developers and content creators must recognize the immense impact of Generative AI on society and culture. Creating equitable, fair, and inclusive generative AI models requires a deep understanding of the historical and cultural influences in data sets, algorithms, and human decisions. It also demands the development of robust fairness tools and a commitment to mitigating potential risks that could undermine the reliability and trustworthiness of AI-generated outputs. This is a unique opportunity for us to celebrate and embrace the rich tapestry of our shared human experience, and to remember, with Mahatma Gandhi, that our greatness lies not so much in our ability to remake the world as in our ability to transform ourselves.

_________

===> Follow me to read my next publication on The Unique Challenges and Opportunities of Generative AI.
