Claude Opus 4.5 vs Opus 4.6

Head-to-head benchmark across 20 tasks in 8 categories

Executive Summary

Visual Comparison

Category Scores (Radar)

API Latency by Task (ms)

Cost per Task ($)

Output Tokens by Task

Category Breakdown

Category | Opus 4.6 | Opus 4.5 | Winner

Detailed Results