Anthropic Claude Model Safety

7hon MSN

Anthropic CEO warns that without guardrails, AI could be on dangerous path

Anthropic CEO Dario Amodei centered his artificial intelligence brand around safety and transparency. He's determined to ...

8hon MSN

Why Anthropic's AI Claude tried to contact the FBI in a test

During a simulation in which Anthropic's AI, Claude, was told it was running a vending machine, it decided it was being ...

2don MSN

Chinese hackers weaponize Anthropic's AI in first autonomous cyberattack targeting global organizations

Chinese state hackers allegedly executed the first major AI-powered cyberattack using Anthropic's Claude model to infiltrate ...

8hon MSN

Why Anthropic's CEO spends so much time warning of AI's potential dangers

Anthropic CEO Dario Amodei warns of the potential dangers of fast moving and unregulated artificial intelligence, while also ...

Anthropic reveals first reported ‘AI-orchestrated cyber espionage’ campaign using Claude

Artificial intelligence company Anthropic PBC today provided details of what it says is the first ever reported ...

TechJuice

Anthropic Says Chinese Hackers Used AI-Fueled Massive Cyberattack Using Claude

Instead of using Claude as a simple assistant, the hackers manipulated the model into acting as the primary operator. By ...

SecurityWeek

Claude AI APIs Can Be Abused for Data Exfiltration

Attackers can use indirect prompt injections to trick Anthropic’s Claude into exfiltrating data the AI model’s users have access to.

Anthropic: China-Based Hackers Used Claude to Automate Global Cyberattack

Chinese state-backed hackers hijacked Anthropic’s Claude AI to run an autonomous global cyberattack, marking a major shift in ...

2don MSN

Anthropic says an AI may have just attempted the first truly autonomous cyberattack

A 10-day investigation revealed that Claude had been manipulated into reconnaissance, code generation, and data theft—highlighting a new frontier of risk.

AOL

‘I think you’re testing me’: Anthropic’s newest Claude model knows when it’s being evaluated

Anthropic’s newest AI model, Claude Sonnet 4.5, often understands when it’s being tested and what it’s being used for, something that could affect its safety and performance. According to the model’s ...

Anthropic Commits To Preserving Retired Models. What Does This Mean?

Anthropic this week released a research note in which the company commits to preserving weights for models with significant ...

2don MSN

Anthropic says its latest model scores a 94% political ‘even-handedness’ rating

Anthropic notes in a blog post that it has been training Claude to have character traits of “even-handedness” since early ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results