Anthropic's new Claude Opus 4.5 model achieved 80.9% on SWE-bench and scored higher than human candidates on a performance ...
Anthropic’s new Claude Opus 4.5 model has topped the SWE-bench benchmark, making it the leading model in the world for coding, but ...