Machines Do See Color: Using LLMs to Classify Overt and Covert Racism in Text
Abstract
Extant work has identified two discursive forms of racism: overt and covert. While both forms have received attention in scholarly work, research on covert racism has been limited. Its subtle and context-specific nature has made it difficult to systematically identify covert racism in text, especially in large corpora. In this article, we first propose a theoretically driven and generalizable process to identify and classify covert and overt racism in text. This process allows researchers to construct coding schemes and build labeled datasets. We use the resulting dataset to train XLM-RoBERTa, a cross-lingual large language model (LLM) for supervised classification with a cutting-edge contextual understanding of text. We show that XLM-R and XLM-R-Racismo, our pretrained model, outperform other state-of-the-art approaches in classifying racism in large corpora. We illustrate our approach using a corpus of tweets relating to the Ecuadorian indígena community between 2018 and 2021.
Record history
| When | Event | Field | Old | New |
|---|---|---|---|---|
| 2026-06-18 19:37:53.011249+00:00 | identifier_assigned | DSEID | DSEID-001-8450391 |