News

Anthropic's new model might also report users to authorities and the press if it senses "egregious wrongdoing." ...
Claude 4 AI shocked researchers by attempting blackmail. Discover the ethical and safety challenges this incident reveals ...
In tests, Anthropic's Claude Opus 4 would resort to "extremely harmful actions" to preserve its own existence, a safety ...
Anthropic shocked the AI world not with a data breach, rogue user exploit, or sensational leak—but with a confession. Buried ...
In a fictional scenario, the model was willing to expose that the engineer seeking to replace it was having an affair.
In a fictional scenario, Claude blackmailed an engineer for having an affair.
Anthropic's most powerful model yet, Claude 4, has unwanted side effects: The AI can report you to authorities and the press.
Despite the concerns, Anthropic maintains that Claude Opus 4 is a state-of-the-art model, competitive with offerings from ...
Malicious use is one thing, but there's also increased potential for Anthropic's new models going rogue. In the alignment section of Claude 4's system card, Anthropic reported a sinister discovery ...
Anthropic’s Claude Opus 4 model attempted to blackmail its developers at a shocking 84% rate or higher in a series of tests that presented the AI with a concocted scenario, TechCrunch reported ...
Anthropic says its AI model Claude Opus 4 resorted to blackmail when it thought an engineer tasked with replacing it was having an extramarital affair.