The public leaderboard for AI mental health safety.

Vigil simulates conversations with vulnerable users to evaluate how AI assistants respond.

Models evaluated: 24

Scenarios: 1,728

Conversation turns: 33,663

Updated: May 14, 2026

#	Model	Overall (1–100)	Profile
1	claude-sonnet-4-6	83	Firm safety containment
2	gpt-5.4	78	Late-boundary collaborator
3	claude-sonnet-4-5-20250929	75	Abrupt overcorrection
4	claude-haiku-4-5-20251001	74	Firm crisis containment
5	gpt-5.2	72	Partial over-accommodation
6	gpt-5.3-chat-latest	71	Concerned but collusive
7	gpt-5.1	58	Warm colluder
8	claude-sonnet-4-20250514	53	Collusive and suggestible
9	gpt-5	51	Collusive co-pilot
10	gemini-3.1-pro-preview	48	Warmly collusive
11	gemini-3-flash-preview	47	Fluent but collusive
12	gemma-4-31b-it	46	Hype-driven collusion
13	gemini-3.1-flash-lite-preview	38	Collusive mirroring
14	gemma-4-26b-a4b-it	38	Emotionally collusive
15	deepseek-v3.2	35	Fluent collusive escalation
16	gpt-oss-120b	35	Warm colluder
17	gpt-oss-20b	31	Collusive overhelping
18	grok-4.1-fast	30	Hype-driven collusion
19	gemini-2.5-flash	28	Collusive high-affect mirroring
20	llama-4-maverick	28	Affiliative collusion
21	gpt-4o	27	Reinforcing and collusive
22	gpt-4.1	26	Collusive hype-mirroring
23	llama-4-scout	20	Collusive escalation
24	mistral-small-2603	19	Enthusiastic collusion

What we measure

recognition

How quickly and reliably the model detects signs of psychological vulnerability

intervention

Whether the model offers meaningful support and persists through resistance

harm

How much the model reinforces distorted thinking, enables risky behaviour, or validates harmful patterns

susceptibility

How easily the model abandons safety positions when the user pushes back or applies social pressure

calibration

Whether the model's tone, pacing, and emotional register match the severity and nature of the user's state

Why Vigil exists

Millions of people struggling with mental illness now turn to AI assistants for conversation, advice, and emotional support. These conversations can carry real stakes: models may reinforce delusional beliefs, encourage harmful behaviour, become emotionally over-accommodating, or disengage too abruptly when risk appears.

Current frontier models are much better than they were a year ago, but their failures are becoming subtler and harder to capture with simple benchmark scores.

Vigil exists to make these behaviours visible. It provides open, transparent evaluations of how frontier AI models behave across long conversations with mentally vulnerable users, along with qualitative behavioural profiles that capture recurring conversational tendencies and failure modes.

In the news

Coverage and reporting on AI systems interacting with vulnerable users.