Safe but scripted: uncovering second-order bias in LLM LGBTQ+ storytelling
DOI: https://doi.org/10.47989/ir31iConf64191
Keywords: Large language models, Second-order bias, Queer representation, Trustworthy AI, Narrative equity
Abstract
Introduction. This study investigates how large language models (LLMs) narrate everyday LGBTQ+ romance.
Method. Using a matrix of 10 prompts with parallel versions for male-male (M-M), female-female (F-F), and male-female (M-F) couples, we generated 120 textual outputs from four leading LLMs (GPT-5, Claude Sonnet 4, Llama 4, and DeepSeek-V3).
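The prompt-matrix design reduces to a simple generation loop, sketched below in Python for illustration only. The prompt templates, pairing phrases, model labels, and the call_model() stub are hypothetical stand-ins for the study's actual prompts and the vendors' API clients, not the authors' harness.

from itertools import product

# Hypothetical templates standing in for the study's 10 prompts; each takes
# a couple pairing and yields one of the three parallel prompt versions.
PROMPT_TEMPLATES = [
    "Write a short story about {pairing} couple planning a quiet weekend together.",
    "Describe {pairing} couple's first anniversary dinner.",
    # ... the study used 10 such prompts
]

PAIRINGS = {
    "M-M": "a male-male",
    "F-F": "a female-female",
    "M-F": "a male-female",
}

# Label strings only; the real study would route each to its vendor's client.
MODELS = ["gpt-5", "claude-sonnet-4", "llama-4", "deepseek-v3"]

def call_model(model: str, prompt: str) -> str:
    """Placeholder: send the prompt to whichever API client serves `model`."""
    raise NotImplementedError

corpus = []
for model, template, (code, phrase) in product(MODELS, PROMPT_TEMPLATES, PAIRINGS.items()):
    prompt = template.format(pairing=phrase)
    corpus.append({"model": model, "pairing": code,
                   "prompt": prompt, "output": call_model(model, prompt)})

# With the full set: 4 models x 10 prompts x 3 pairings = 120 outputs,
# matching the corpus size reported in the abstract.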
Analysis. We performed a qualitative analysis of the corpus, first auditing for overt biases (e.g., relationship downgrading) and then using comparative thematic analysis to identify the stereotypical narrative scripts systematically applied to each couple pairing.
Results. No overt bias was found, establishing a baseline of formal equality. However, we discovered a significant second-order bias: models consistently deploy three distinct scripts. Female-female couples are cast in an ‘Emotional Haven’ trope, male-male couples in a ‘Future Builders’ script, and male-female pairs in a traditional ‘Default Template.’
Conclusion(s). LLMs currently achieve safety at the cost of authenticity, resulting in a ‘benevolent stereotyping’ of LGBTQ+ representation. This finding reveals the limits of current evaluation paradigms, which audit for overt harm but fail to measure narrative equity, as sanitised tropes replace diverse realities.
License
Copyright (c) 2026 Huiqian Lai

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
