Paper

Epistemic Injustice in Language Models: An Audit of Pretraining Filters and Guardrails

arXiv:2606.05936v1 Announce Type: new Abstract: Modern language models rely on pretraining filters to remove undesirable content from training corpora and inference-time guardrails to suppress undesirable outputs during deployment. In this paper, we examine how these filtering and moderation decisions produce forms of epistemic erasure and reveal tensions both across automated systems and between these systems and human judgment. We audit four pretraining filters and three inference-time guardrails on Common Crawl sentences containing gender and regional-origin mentions, together with a manua…

arXiv cs.CLPublished 2026-06-05Paper link

Authors:

Topics

Relevant entities

People

Linked people will appear here.

Related coverage

Linked coverage will appear here.

Related events

Linked events will appear here.

Related discussions

Related discussion nodes will appear here.