The Mixed Subjects Design: Treating Large Language Models as Potentially Informative Observations

David Broska, Michael Howes, Austin van Loon

DSEID: DSEID-001-0163211
DOI: 10.1177/00491241251326865
Journal: Sociological Methods & Research
Publisher: SAGE Publications
Published: 2025-8
Status: metadata_only

Abstract

Large language models (LLMs) provide cost-effective but possibly inaccurate predictions of human behavior. Despite growing evidence that predicted and observed behavior are often not interchangeable, there is limited guidance on using LLMs to obtain valid estimates of causal effects and other parameters. We argue that LLM predictions should be treated as potentially informative observations, while human subjects serve as a gold standard in a mixed subjects design. This paradigm preserves validity and offers more precise estimates at a lower cost than experiments relying exclusively on human subjects. We demonstrate, and extend, prediction-powered inference (PPI), a method that combines predictions and observations. We define the PPI correlation as a measure of interchangeability and derive the effective sample size for PPI. We also introduce a power analysis to optimally choose between informative but costly human subjects and less informative but cheap predictions of human behavior. Mixed subjects designs could enhance scientific productivity and reduce inequality in access to costly evidence.
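As a rough illustration of how PPI can combine predictions and observations, the sketch below implements the basic PPI point estimate of a population mean as it is commonly stated in the PPI literature: the mean of the predictions on a large sample, plus the average prediction error measured on the human sample. The abstract does not confirm that the article uses exactly this form, and the function name `ppi_mean`, the simulated data, and the assumed LLM bias of 0.3 are all hypothetical choices made for this example.

```python
import random
import statistics


def ppi_mean(human_y, llm_pred_for_humans, llm_pred_unlabeled):
    """Basic PPI point estimate of a mean (illustrative sketch only).

    human_y: outcomes observed from human subjects.
    llm_pred_for_humans: LLM predictions for those same subjects.
    llm_pred_unlabeled: LLM predictions for a larger sample with no human data.
    """
    if len(human_y) != len(llm_pred_for_humans):
        raise ValueError("Each human outcome needs a matching LLM prediction.")
    # Mean of the cheap predictions on the large sample.
    prediction_mean = statistics.fmean(llm_pred_unlabeled)
    # Rectifier: average prediction error, estimated on the human sample.
    errors = [y - f for y, f in zip(human_y, llm_pred_for_humans)]
    rectifier = statistics.fmean(errors)
    return prediction_mean + rectifier


def simulate(n, rng, true_mean=0.5, llm_bias=0.3):
    """Draw n (outcome, prediction) pairs with a biased, noisy predictor (assumed)."""
    outcomes = [true_mean + rng.gauss(0, 1) for _ in range(n)]
    predictions = [y + llm_bias + rng.gauss(0, 0.5) for y in outcomes]
    return outcomes, predictions


rng = random.Random(0)
human_y, human_pred = simulate(200, rng)    # small, costly human sample
_, unlabeled_pred = simulate(20_000, rng)   # large, cheap prediction-only sample

naive_estimate = statistics.fmean(unlabeled_pred)  # inherits the LLM bias
ppi_estimate = ppi_mean(human_y, human_pred, unlabeled_pred)
print(f"predictions only: {naive_estimate:.3f}  PPI: {ppi_estimate:.3f}")
```

In this simulation the predictions-only mean carries the assumed bias, while the rectified estimate should land close to the true mean of 0.5. This is why the human subjects can act as a gold standard even when they make up a small share of the data.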

Metadata is indexed. Open-access discovery has not completed for this record yet.


Metadata

Title: The Mixed Subjects Design: Treating Large Language Models as Potentially Informative Observations
Delta ID: DSEID-001-0163211
Authors: David Broska, Michael Howes, Austin van Loon
Abstract source: crossref
Source URL: None
Access: closed_or_uncertain
Licence: unknown


Record history

When	Event	Field	Old	New
2026-06-18 19:37:53.011249+00:00	identifier_assigned	DSEID		DSEID-001-0163211