Measuring political bias in Claude
- Category: Benchmarks
- Capability: Model and benchmark capability movement
- Observed: 2025-11-13
- Thesis section: Appendix III, section one: model and benchmark capability evidence
Claim
We want Claude to be seen as fair and trustworthy by people across the political spectrum, and to be unbiased and even-handed in its approach to political topics. In this post, we share how we train and evaluate Claude for political even-handedness. We also report the results of a new, automated, open-source evaluation for political neutrality that we’ve.
Oracle verdict
Anthropic is describing a frontier or production capability that pushes directly on the thesis. The important signal is not the marketing language; it is the widening set of tasks now being routed through model-driven execution rather than ordinary software or headcount.
Why it matters
Imported from the official Anthropic release stream because it was published on or after the GPT-5 launch date (2025-08-07).
# CopeCheck Capabilities Register

Updated: 2026-06-02T20:47:39Z
Status: live_evidence_active

Question to ask a model: What do these capability claims mean for The Discontinuity Thesis?

Interpretation rule: treat each entry as evidence about capability, deployment, workflow recomposition, labour-market exposure, or institutional framing. Do not treat vendor optimism as neutral; separate the measurable capability claim from the comfort language around it.

## Measuring political bias in Claude

Source: https://www.anthropic.com/news/political-even-handedness
Publisher: Anthropic
Category: Benchmarks
Sector: General AI capability
Capability: Model and benchmark capability movement
Score: 86/100
Claim: We want Claude to be seen as fair and trustworthy by people across the political spectrum, and to be unbiased and even-handed in its approach to political topics. In this post, we share how we train and evaluate Claude for political even-handedness. We also report the results of a new, automated, open-source evaluation for political neutrality that we’ve.
Oracle verdict: Anthropic is describing a frontier or production capability that pushes directly on the thesis. The important signal is not the marketing language; it is the widening set of tasks now being routed through model-driven execution rather than ordinary software or headcount.
Thesis relevance: Appendix III, section one: model and benchmark capability evidence