Using Large Language Models for Qualitative Analysis can Introduce Serious Bias

Julian Ashwin, Aditya Chhabra, Vijayendra Rao

DSEID: DSEID-001-4453559
DOI: 10.1177/00491241251338246
Journal: Sociological Methods & Research
Publisher: SAGE Publications
Published: 2025-05-27
Status: metadata_only

Abstract

Large language models (LLMs) are quickly becoming ubiquitous, but their implications for social science research are not yet well understood. We ask whether LLMs can help code and analyse large-N qualitative data from open-ended interviews, with an application to transcripts of interviews with Rohingya refugees and their Bengali hosts in Bangladesh. We find that using LLMs to annotate and code text can introduce bias that can lead to misleading inferences. By bias we mean that the errors that LLMs make in coding interview transcripts are not random with respect to the characteristics of the interview subjects. Training simpler supervised models on high-quality human codes leads to less measurement error and bias than LLM annotations. Given that high-quality codes are necessary in order to assess whether an LLM introduces bias, we argue that it may be preferable to train a bespoke model on a subset of transcripts coded by trained sociologists rather than use an LLM.
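
The abstract defines bias as coding errors that are not random with respect to the characteristics of the interview subjects. A minimal sketch of how such a check might look is given here. The toy records, the group labels, and the helper names are hypothetical and are not taken from the paper, and the permutation test is only one simple way to probe for non-random errors, not necessarily the authors' method.

```python
import random

# Toy records invented purely for illustration (NOT data from the paper).
# Each tuple is (group, human_code, llm_code), with codes as 0/1 labels.
records = (
    [("refugee", 1, 1)] * 30 + [("refugee", 1, 0)] * 20
    + [("host", 1, 1)] * 45 + [("host", 1, 0)] * 5
)

def error_rate_by_group(rows):
    """Share of rows per group where the LLM code differs from the human code."""
    totals, errors = {}, {}
    for group, human, llm in rows:
        totals[group] = totals.get(group, 0) + 1
        errors[group] = errors.get(group, 0) + (human != llm)
    return {g: errors[g] / totals[g] for g in totals}

def error_gap(rows, a="refugee", b="host"):
    """Difference in LLM error rates between two (hypothetical) groups."""
    rates = error_rate_by_group(rows)
    return rates[a] - rates[b]

def permutation_p_value(rows, n_permutations=2000, seed=0):
    """Two-sided permutation test: shuffle group labels and recompute the gap."""
    rng = random.Random(seed)
    observed = abs(error_gap(rows))
    groups = [r[0] for r in rows]
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(groups)
        shuffled = [(g, h, l) for g, (_, h, l) in zip(groups, rows)]
        if abs(error_gap(shuffled)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)

rates = error_rate_by_group(records)
gap = error_gap(records)
p_value = permutation_p_value(records)
print(rates, gap, p_value)
```

If LLM errors were random with respect to group, the gap between error rates would usually be small and the p-value large. Note that the check depends on the human codes in the second column, which mirrors the abstract's point that high-quality codes are needed to assess bias in the first place.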

Metadata

Abstract source: crossref
Source URL: None
Access: closed_or_uncertain
Licence: unknown

Record history

When	Event	Field	Old	New
2026-06-18 19:37:53.011249+00:00	identifier_assigned	DSEID		DSEID-001-4453559