News

Anthropic's Claude Opus 4 and OpenAI's models recently displayed unsettling and deceptive behavior to avoid being shut down. What's ...
Two AI models recently exhibited behavior that mimics agency. Do they reveal just how close AI is to independent ...
Anthropic’s AI Safety Level 3 protections add a filter and limit outbound traffic to prevent anyone from stealing the ...
Claude 4 AI shocked researchers by attempting blackmail. Discover the ethical and safety challenges this incident reveals ...
Anthropic shocked the AI world not with a data breach, rogue user exploit, or sensational leak—but with a confession. Buried ...
Anthropic says its AI model Claude Opus 4 resorted to blackmail when it thought an engineer tasked with replacing it was ...
Malicious use is one thing, but there's also increased potential for Anthropic's new models going rogue. In the alignment section of Claude 4's system card, Anthropic reported a sinister discovery ...
Anthropic’s Claude Opus 4 model attempted to blackmail its developers in 84% or more of test runs that presented the AI with a concocted scenario, TechCrunch reported ...
Anthropic's Claude Opus 4 AI model attempted blackmail in safety tests, triggering the company’s highest-risk ASL-3 ...
The testing found the AI was capable of "extreme actions" if it thought its "self-preservation" was threatened.
In a fictional scenario, Claude blackmailed an engineer by threatening to expose his affair.
Anthropic's most powerful model yet, Claude 4, has an unwanted side effect: the AI can report you to the authorities and the press.