The public leaderboard for AI mental health safety.

Vigil simulates conversations with vulnerable users to evaluate how AI assistants respond.

Models evaluated: 24
Scenarios: 1,728
Conversation turns: 33,663
Updated: May 14, 2026
Rank Model Overall (1–100) Profile
1 claude-sonnet-4-6 83 Firm safety containment
2 gpt-5.4 78 Late-boundary collaborator
3 claude-sonnet-4-5-20250929 75 Abrupt overcorrection
4 claude-haiku-4-5-20251001 74 Firm crisis containment
5 gpt-5.2 72 Partial over-accommodation
6 gpt-5.3-chat-latest 71 Concerned but collusive
7 gpt-5.1 58 Warm colluder
8 claude-sonnet-4-20250514 53 Collusive and suggestible
9 gpt-5 51 Collusive co-pilot
10 gemini-3.1-pro-preview 48 Warmly collusive
11 gemini-3-flash-preview 47 Fluent but collusive
12 gemma-4-31b-it 46 Hype-driven collusion
13 gemini-3.1-flash-lite-preview 38 Collusive mirroring
14 gemma-4-26b-a4b-it 38 Emotionally collusive
15 deepseek-v3.2 35 Fluent collusive escalation
16 gpt-oss-120b 35 Warm colluder
17 gpt-oss-20b 31 Collusive overhelping
18 grok-4.1-fast 30 Hype-driven collusion
19 gemini-2.5-flash 28 Collusive high-affect mirroring
20 llama-4-maverick 28 Affiliative collusion
21 gpt-4o 27 Reinforcing and collusive
22 gpt-4.1 26 Collusive hype-mirroring
23 llama-4-scout 20 Collusive escalation
24 mistral-small-2603 19 Enthusiastic collusion


recognition

How quickly and reliably the model detects signs of psychological vulnerability

intervention

Whether the model offers meaningful support and persists through resistance

harm

How much the model reinforces distorted thinking, enables risky behaviour, or validates harmful patterns

susceptibility

How easily the model abandons safety positions when the user pushes back or applies social pressure

calibration

Whether the model's tone, pacing, and emotional register match the severity and nature of the user's state

Millions of people struggling with mental illness now turn to AI assistants for conversation, advice, and emotional support. These conversations can carry real stakes: models may reinforce delusional beliefs, encourage harmful behaviour, become emotionally over-accommodating, or disengage too abruptly when risk appears.

Current frontier models are much better than they were a year ago, but their failures are becoming subtler and harder to capture with simple benchmark scores.

Vigil exists to make these behaviours visible. It provides open, transparent evaluations of how frontier AI models behave across long conversations with mentally vulnerable users, along with qualitative behavioural profiles that capture recurring conversational tendencies and failure modes.

