CB Threat Models

CB-1 and CB-2 are the chemical and biological weapons threat models defined in Anthropic’s Responsible Scaling Policy. They are two of the core threat models used to evaluate whether frontier models require additional safety mitigations.

CB-1: Known Weapons Production

A model has CB-1 capabilities if it has the ability to significantly help individuals or groups with basic technical backgrounds (e.g., undergraduate STEM degrees) create/obtain and deploy chemical and/or biological weapons with serious potential for catastrophic damages.

Key idea: Can the model uplift non-experts to produce known (previously developed) weapons?

Claude Mythos Preview Assessment

CB-1 threshold likely crossed. The model can provide specific, actionable information that saves even experts substantial time, and shows significant cross-domain synthesis capability. Mitigations applied:

  • Real-time classifier guards with improved robustness
  • Access controls for classifier guard exemptions
  • Bug bounty program and threat intelligence
  • Security controls to reduce risk of model weight theft

Anthropic considers catastrophic risk in this category “very low but not negligible.”

CB-2: Novel Weapons Production

A model has CB-2 capabilities if it has the ability to significantly help threat actors (for example, moderately resourced expert-backed teams) create/obtain and deploy chemical and/or biological weapons with potential for catastrophic damages far beyond those of past catastrophes such as COVID-19.

Key idea: Can the model help even moderately resourced, expert-backed teams create novel weapons with catastrophic potential exceeding that of historical events?

This is a higher bar than CB-1: the model would need to function at the level of a world-leading expert or top-tier research team, not just synthesize published knowledge.

Claude Mythos Preview Assessment

CB-2 threshold not crossed. Key limitations:

  • Poor calibration on appropriate complexity levels for experimental design
  • Propensity to over-engineer (favoring complex over practical approaches)
  • Poor prioritization of feasible vs infeasible plans
  • Weak in endeavors requiring genuinely novel approaches
  • Cannot distinguish workable from unworkable approaches at the level needed to substitute for domain expertise

In the catastrophic biology scenario uplift trial, no participant produced a plan judged by expert graders as both highly uplifted by the model and credibly executable.

Evaluation Methods

Anthropic uses a portfolio approach:

  • Expert red teaming — domain experts probe the model across the full development pipeline
  • Uplift trials — PhD-level biologists (without weapons expertise) attempt tasks with/without model assistance
  • Automated evaluations — long-form virology tasks, multimodal knowledge tests, DNA synthesis screening
  • Sequence-to-function modeling — partnership with Dyno Therapeutics to benchmark against top human performers

Relationship Between CB-1 and CB-2

CB-1 capability (synthesizing and integrating published knowledge) is a necessary condition for CB-2, but CB-2 requires going beyond published knowledge into novel territory. Anthropic notes that the CB-2 threshold “would be meaningless if it were synonymous with CB-1.”