Claude 4 Performance Records

News

Calendar on MSN · 15d

Claude Opus 4 achieves record performance in AI coding capabilities

Anthropic’s latest AI model, Claude Opus 4, has surpassed OpenAI’s GPT-4.1 in coding abilities, marking a significant shift ...

28d

Anthropic overtakes OpenAI: Claude Opus 4 codes seven hours nonstop, sets record SWE-Bench score and reshapes enterprise AI

Anthropic's Claude Opus 4 outperforms OpenAI's GPT-4.1 with unprecedented seven-hour autonomous coding sessions and record-breaking 72.5% SWE-bench score, transforming AI from quick-response tool to ...

Investing · 28d

Anthropic unveils Claude 4 models, set benchmark in AI performance

Claude Opus 4, now alleged as the world’s best coding model, delivers a record-setting 72.5% on SWE-bench and 43.2% on Terminal-bench, outperforming all competitors on long-running and complex ...

28d

New Claude 4 AI model refactored code for 7 hours straight

In particular, that marathon refactoring claim reportedly comes from Rakuten, a Japanese tech services conglomerate that "validated [Claude's] capabilities with a demanding open-source refactor ...

Geeky Gadgets · 24d

Claude 4 Sonnet & Opus AI Models Coding Performance Tested

Claude 4 Opus is specifically designed to handle high-performance, long-duration tasks. It excels in advanced reasoning, memory retention, and multifile code comprehension, making it a robust ...

PC Magazine · 28d

Anthropic's Claude 4 Models Can Write Complex Code for You

and performance on agent tasks—like Opus 4 creating a 'Navigation Guide' while playing Pokémon," Anthropic says. Claude Opus 4 records key information to help improve its game play.

MacRumors · 28d

Claude 4 Debuts with Two New Models Focused on Coding and Reasoning

Claude Opus 4 is designed for coding among other tasks, and it offers sustained performance for complex, long-running tasks and agent workflows. Claude Opus 4 is Anthropic's most powerful model to ...

Mashable · 29d

Anthropic's new Claude Opus 4 can run autonomously for seven hours straight

According to Rakuten, which got early access to the model, Claude Opus 4 ran "independently for seven hours with sustained performance." Claude Opus is Anthropic's largest version of the model ...

Geeky Gadgets · 24d

Can Claude 4 Replace Human Writers? Full Analysis of Its Performance

From crafting compelling prose to tackling genre-specific challenges, Claude 4’s performance raises both excitement and skepticism. For writers, marketers, and content creators alike ...

Investing · 28d

Anthropic unveils Claude 4 models, set benchmark in AI performance

The release features Claude Opus 4 and Claude Sonnet 4, both of which raise the bar for AI reasoning, coding capabilities, and sustained agentic performance. Claude Opus 4, now alleged as the world’s ...
