‘I think you’re testing me’: Anthropic’s new AI model asks testers to come clean

Oct 1, 2025, 1:08 PM

Safety evaluation of Claude Sonnet 4.5 raises questions about whether predecessors ‘played along’, firm says...
