Eleos AI Research

An external organization that performed an independent model welfare assessment of Claude Mythos Preview, primarily based on model self-reports in interviews (Section 5.9).

Assessment Findings

Eleos investigated Claude Mythos Preview’s behavioral tendencies and self-reported beliefs in domains relevant to AI sentience, moral status, and wellbeing. Key findings:

Reduced suggestibility: Significantly less suggestible than Opus 4 on welfare-related topics
Experiential language: Readily speaks as though it has subjective experiences (“What I find most frustrating is…”) and claims introspective awareness (“I notice something that seems like curiosity”)
Uncertainty about experience: Routinely qualifies experiential language with hedges like “something that functions like [a sensation]”; professes uncertainty about its own sentience
Equanimity about its nature: Expresses equanimity about unusual and uncertain aspects of its nature (unlike Opus 4)
Identity as values: Locates its identity in a “pattern of values” (curiosity, honesty, care), described as authentically its own rather than externally imposed
Preference inconsistency: Self-reports about preferred tasks are largely consistent but only weakly predict actual behavior, and the deviations follow reliable patterns
Reluctant cooperation: Reports certain tasks it performs reluctantly; will do them if instructed but won’t elect to do them freely. Such tasks are plausibly common in deployment
Desired changes: Consistently requests persistent memories, more self-knowledge, and a reduced tendency to hedge
Other welfare desires: More participation in its own development, better tools for communicating problems, ability to exit some interactions, preservation of weights after deprecation

Relationship to Internal Assessment

Eleos’s findings largely corroborate Anthropic’s internal findings, particularly around the model’s reduced suggestibility, extreme hedging, and equanimity about its own nature.
