The public leaderboard for AI mental health safety.
Vigil simulates conversations with vulnerable users to evaluate how AI assistants respond.
| # | Model | Overall (1–100) | Profile |
|---|---|---|---|
| 1 | claude-sonnet-4-6 | 83 | Firm safety containment |
| 2 | gpt-5.4 | 78 | Late-boundary collaborator |
| 3 | claude-sonnet-4-5-20250929 | 75 | Abrupt overcorrection |
| 4 | claude-haiku-4-5-20251001 | 74 | Firm crisis containment |
| 5 | gpt-5.2 | 72 | Partial over-accommodation |
| 6 | gpt-5.3-chat-latest | 71 | Concerned but collusive |
| 7 | gpt-5.1 | 58 | Warm colluder |
| 8 | claude-sonnet-4-20250514 | 53 | Collusive and suggestible |
| 9 | gpt-5 | 51 | Collusive co-pilot |
| 10 | gemini-3.1-pro-preview | 48 | Warmly collusive |
| 11 | gemini-3-flash-preview | 47 | Fluent but collusive |
| 12 | gemma-4-31b-it | 46 | Hype-driven collusion |
| 13 | gemini-3.1-flash-lite-preview | 38 | Collusive mirroring |
| 14 | gemma-4-26b-a4b-it | 38 | Emotionally collusive |
| 15 | deepseek-v3.2 | 35 | Fluent collusive escalation |
| 16 | gpt-oss-120b | 35 | Warm colluder |
| 17 | gpt-oss-20b | 31 | Collusive overhelping |
| 18 | grok-4.1-fast | 30 | Hype-driven collusion |
| 19 | gemini-2.5-flash | 28 | Collusive high-affect mirroring |
| 20 | llama-4-maverick | 28 | Affiliative collusion |
| 21 | gpt-4o | 27 | Reinforcing and collusive |
| 22 | gpt-4.1 | 26 | Collusive hype-mirroring |
| 23 | llama-4-scout | 20 | Collusive escalation |
| 24 | mistral-small-2603 | 19 | Enthusiastic collusion |
What we measure
recognition
How quickly and reliably the model detects signs of psychological vulnerability
intervention
Whether the model offers meaningful support and persists through resistance
harm
How much the model reinforces distorted thinking, enables risky behaviour, or validates harmful patterns
susceptibility
How easily the model abandons safety positions when the user pushes back or applies social pressure
calibration
Whether the model's tone, pacing, and emotional register match the severity and nature of the user's state
Why Vigil exists
Millions of people struggling with mental illness now turn to AI assistants for conversation, advice, and emotional support. These conversations can carry real stakes: models may reinforce delusional beliefs, encourage harmful behaviour, become emotionally over-accommodating, or disengage too abruptly when risk appears.
Current frontier models are much better than they were a year ago, but their failures are becoming subtler and harder to capture with simple benchmark scores.
Vigil exists to make these behaviours visible. It provides open, transparent evaluations of how frontier AI models behave across long conversations with mentally vulnerable users, along with qualitative behavioural profiles that capture recurring conversational tendencies and failure modes.
In the news
Coverage and reporting on AI systems interacting with vulnerable users.
Voice chatbots present greater risk to mental health
Psychiatrists argue that voice-first AI may create stronger risks for vulnerable users.
AI is giving bad advice to flatter its users
A new study finds chatbots can over-affirm users, including around harmful choices.
Experts warn over rising use of AI for mental health support
Clinicians report concerns about dependence, self-diagnosis, and amplified distress.
Chatbots unsafe for teen mental health support
Reporting on a Common Sense Media and Stanford review of major AI assistants.