‘I think you’re testing me’: Anthropic’s new AI model asks testers to come clean

Safety evaluation of Claude Sonnet 4.5 raises questions about whether predecessors ‘played along’, firm says...
