PromptLoop
📊 Live Benchmark

AI Models Leaderboard 2026

All relevant large language models at a glance, sortable by Quality Index, speed, latency, and price. Data source: Artificial Analysis.

200 active models from 12 vendor families · Last synchronization:

How do you read this leaderboard?

Quality Index is a composite score from Artificial Analysis built from more than ten independent benchmarks (MMLU-Pro, GPQA, HLE, LiveCodeBench, SciCode, AIME, among others). Higher is better. Speed measures output tokens per second as the median across all hosting providers. Latency is the time to first token in seconds, which matters for streaming UIs. Prices are in US dollars per 1 million tokens, listed separately for input and output. Sort by your priority and filter by vendor or price bucket, then check concrete costs in the token calculator or compare two models head to head.

| # | Model | Model ID | Vendor | Quality | Speed | Latency | Input (USD/1M) | Output (USD/1M) |
|---|---|---|---|---|---|---|---|---|
| 1 | Anthropic: Claude Opus 4.8 | anthropic/claude-opus-4.8 | Anthropic | 61.4 | 61.1 t/s | 27.35 s | $6.25 | $25.00 |
| 2 | OpenAI: GPT-5.5 | openai/gpt-5.5 | OpenAI | 60.2 | 66.2 t/s | 84.68 s | $5.00 | $30.00 |
| 3 | Google: Gemini 3.1 Pro Preview | google/gemini-3.1-pro-preview | Google | 57.2 | 127.0 t/s | 22.14 s | $2.00 | $12.00 |
| 4 | Google: Gemini 3.5 Flash | google/gemini-3.5-flash | Google | 54.8 | 201.9 t/s | 12.44 s | $1.50 | $9.00 |
| 5 | MiniMax: MiniMax M3 | minimax/minimax-m3 | MiniMax | 54.7 | 39.0 t/s | 2.65 s | $0.30 | $1.20 |
| 6 | OpenAI: GPT-5.3-Codex | openai/gpt-5.3-codex | OpenAI | 53.6 | 97.5 t/s | 60.44 s | $1.75 | $14.00 |
| 7 | Anthropic: Claude Opus 4.7 | anthropic/claude-opus-4.7 | Anthropic | 51.8 | 44.8 t/s | 834 ms | $6.25 | $25.00 |
| 8 | MiniMax: MiniMax M2.7 | minimax/minimax-m2.7 | MiniMax | 49.6 | 62.3 t/s | 1.28 s | $0.30 | $1.20 |
| 9 | Xiaomi: MiMo-V2-Pro | xiaomi/mimo-v2-pro | Xiaomi | 49.2 | 44.8 t/s | 2.11 s | $1.00 | $3.00 |
| 10 | OpenAI: GPT-5.2-Codex | openai/gpt-5.2-codex | OpenAI | 49 | 131.8 t/s | 1.70 s | $1.75 | $14.00 |
| 11 | OpenAI: GPT-5.4 Mini | openai/gpt-5.4-mini | OpenAI | 48.9 | 169.8 t/s | 8.96 s | $0.75 | $4.50 |
| 12 | NVIDIA: Nemotron 3 Ultra | nvidia/nemotron-3-ultra-550b-a55b | NVIDIA | 47.7 | 173.1 t/s | 729 ms | $0.60 | $2.60 |
| 13 | OpenAI: GPT-5.2 | openai/gpt-5.2 | OpenAI | 46.6 | 0 t/s | 0 ms | $1.75 | $14.00 |
| 14 | Anthropic: Claude Opus 4.6 | anthropic/claude-opus-4.6 | Anthropic | 46.5 | 48.6 t/s | 1.18 s | $6.25 | $25.00 |
| 15 | OpenAI: GPT-5 Codex | openai/gpt-5-codex | OpenAI | 44.6 | 186.3 t/s | 7.99 s | $1.25 | $10.00 |
| 16 | Anthropic: Claude Sonnet 4.6 | anthropic/claude-sonnet-4.6 | Anthropic | 44.4 | 55.3 t/s | 1.07 s | $3.75 | $15.00 |
| 17 | xAI: Grok 4.3 | x-ai/grok-4.3 | xAI | 43.9 | 158.0 t/s | 4.39 s | $1.25 | $2.50 |
| 18 | Xiaomi: MiMo-V2-Omni | xiaomi/mimo-v2-omni | Xiaomi | 43.4 | 79.9 t/s | 2.41 s | $0 | $0 |
| 19 | OpenAI: GPT-5.1-Codex | openai/gpt-5.1-codex | OpenAI | 43.1 | 174.1 t/s | 4.34 s | $1.25 | $10.00 |
| 20 | Anthropic: Claude Opus 4.5 | anthropic/claude-opus-4.5 | Anthropic | 43.1 | 61.0 t/s | 1.22 s | $6.25 | $25.00 |
| 21 | StepFun: Step 3.7 Flash | stepfun/step-3.7-flash | Stepfun | 42.6 | 145.1 t/s | 912 ms | $0.20 | $1.15 |
| 22 | MiniMax: MiniMax M2.5 | minimax/minimax-m2.5 | MiniMax | 41.9 | 240.5 t/s | 5.21 s | $0.30 | $1.20 |
| 23 | DeepSeek: DeepSeek V3.2 | deepseek/deepseek-v3.2 | DeepSeek | 41.7 | 0 t/s | 0 ms | $0.30 | $0.45 |
| 24 | xAI: Grok 4 | x-ai/grok-4 | xAI | 41.5 | 0 t/s | 0 ms | $5.50 | $27.50 |
| 25 | OpenAI: o3 Pro | openai/o3-pro | OpenAI | 40.7 | 33.2 t/s | 66.82 s | $20.00 | $80.00 |
| 26 | MiniMax: MiniMax M2.1 | minimax/minimax-m2.1 | MiniMax | 39.4 | 224.5 t/s | 4.25 s | $0.30 | $1.20 |
| 27 | DeepSeek: DeepSeek V4 Pro | deepseek/deepseek-v4-pro | DeepSeek | 39.3 | 65.4 t/s | 1.22 s | $0.435 | $0.87 |
| 28 | Mistral: Mistral Medium 3.5 | mistralai/mistral-medium-3-5 | Mistral | 39.2 | 156.3 t/s | 548 ms | $1.50 | $7.50 |
| 29 | OpenAI: GPT-5 | openai/gpt-5 | OpenAI | 39.2 | 72.5 t/s | 11.36 s | $1.25 | $10.00 |
| 30 | Xiaomi: MiMo-V2-Flash | xiaomi/mimo-v2-flash | Xiaomi | 39.2 | 144.8 t/s | 1.23 s | $0.10 | $0.30 |
| 31 | OpenAI: GPT-5 Mini | openai/gpt-5-mini | OpenAI | 38.9 | 88.6 t/s | 19.07 s | $0.25 | $2.00 |
| 32 | OpenAI: GPT-5.1-Codex-Mini | openai/gpt-5.1-codex-mini | OpenAI | 38.6 | 214.9 t/s | 3.64 s | $0.25 | $2.00 |
| 33 | inclusionAI: Ring-2.6-1T | inclusionai/ring-2.6-1t | Inclusionai | 38.5 | 122.2 t/s | 1.94 s | $0.30 | $2.50 |
| 34 | StepFun: Step 3.5 Flash | stepfun/step-3.5-flash | Stepfun | 38.5 | 166.2 t/s | 10.69 s | $0.10 | $0.30 |
| 35 | OpenAI: o3 | openai/o3 | OpenAI | 38.4 | 140.4 t/s | 7.34 s | $2.00 | $8.00 |
| 36 | OpenAI: GPT-5.4 Nano | openai/gpt-5.4-nano | OpenAI | 38.1 | 165.1 t/s | 3.84 s | $0.20 | $1.25 |
| 37 | Anthropic: Claude Sonnet 4.5 | anthropic/claude-sonnet-4.5 | Anthropic | 37.1 | 51.7 t/s | 1.10 s | $3.75 | $15.00 |
| 38 | DeepSeek: DeepSeek V4 Flash | deepseek/deepseek-v4-flash | DeepSeek | 36.5 | 107.1 t/s | 1.01 s | $0.14 | $0.28 |
| 39 | MiniMax: MiniMax M2 | minimax/minimax-m2 | MiniMax | 36.1 | 120.8 t/s | 1.07 s | $0.30 | $1.20 |
| 40 | NVIDIA: Nemotron 3 Super | nvidia/nemotron-3-super-120b-a12b | NVIDIA | 36 | 155.1 t/s | 1.04 s | $0.30 | $0.75 |
| 41 | Anthropic: Claude Opus 4.1 | anthropic/claude-opus-4.1 | Anthropic | 36 | 37.4 t/s | 1.78 s | $18.75 | $75.00 |
| 42 | Xiaomi: MiMo-V2.5-Pro | xiaomi/mimo-v2.5-pro | Xiaomi | 35.6 | 56.0 t/s | 1.89 s | $0.90 | $2.70 |
| 43 | Google: Gemini 2.5 Pro | google/gemini-2.5-pro | Google | 34.6 | 135.3 t/s | 17.66 s | $1.25 | $10.00 |
| 44 | inclusionAI: Ling-2.6-1T | inclusionai/ling-2.6-1t | Inclusionai | 33.6 | 0 t/s | 0 ms | $0.30 | $2.50 |
| 45 | Google: Gemini 3.1 Flash Lite Preview | google/gemini-3.1-flash-lite-preview | Google | 33.5 | 289.7 t/s | 5.25 s | $0.25 | $1.50 |
| 46 | OpenAI: o4 Mini | openai/o4-mini | OpenAI | 33.1 | 176.0 t/s | 19.07 s | $1.10 | $4.40 |
| 47 | Anthropic: Claude Opus 4 | anthropic/claude-opus-4 | Anthropic | 33 | 36.9 t/s | 1.65 s | $18.75 | $75.00 |
| 48 | Anthropic: Claude Sonnet 4 | anthropic/claude-sonnet-4 | Anthropic | 33 | 52.3 t/s | 1.02 s | $3.75 | $15.00 |
| 49 | Inception: Mercury 2 | inception/mercury-2 | Inception | 32.8 | 1069.7 t/s | 3.57 s | $0.25 | $0.75 |
| 50 | xAI: Grok 3 Mini | x-ai/grok-3-mini | xAI | 32.1 | 60.4 t/s | 596 ms | $0.30 | $0.50 |
| 51 | Anthropic: Claude Haiku 4.5 | anthropic/claude-haiku-4.5 | Anthropic | 31 | 118.8 t/s | 598 ms | $1.25 | $5.00 |
| 52 | Anthropic: Claude 3.7 Sonnet | anthropic/claude-3.7-sonnet | Anthropic | 30.8 | 0 t/s | 0 ms | $3.75 | $15.00 |
| 53 | OpenAI: o1 | openai/o1 | OpenAI | 30.7 | 136.6 t/s | 15.44 s | $15.00 | $60.00 |
| 54 | DeepSeek: DeepSeek V3.2 Speciale | deepseek/deepseek-v3.2-speciale | DeepSeek | 29.4 | 0 t/s | 0 ms | $0 | $0 |
| 55 | xAI: Grok 4.20 | x-ai/grok-4.20 | xAI | 29 | 188.0 t/s | 573 ms | $2.00 | $6.00 |
| 56 | xAI: Grok Code Fast 1 | x-ai/grok-code-fast-1 | xAI | 28.7 | 0 t/s | 0 ms | $0 | $0 |
| 57 | DeepSeek: DeepSeek V3.1 Terminus | deepseek/deepseek-v3.1-terminus | DeepSeek | 28.5 | 0 t/s | 0 ms | $0.27 | $1.00 |
| 58 | OpenAI: GPT-5.1 | openai/gpt-5.1 | OpenAI | 27.4 | 136.5 t/s | 717 ms | $1.25 | $10.00 |
| 59 | DeepSeek: R1 | deepseek/deepseek-r1 | DeepSeek | 27.1 | 0 t/s | 0 ms | $1.35 | $4.20 |
| 60 | OpenAI: GPT-5 Nano | openai/gpt-5-nano | OpenAI | 26.8 | 154.8 t/s | 93.82 s | $0.05 | $0.40 |
| 61 | OpenAI: GPT-4.1 | openai/gpt-4.1 | OpenAI | 26.3 | 149.7 t/s | 583 ms | $2.00 | $8.00 |
| 62 | inclusionAI: Ling-2.6-flash | inclusionai/ling-2.6-flash | Inclusionai | 26.2 | 0 t/s | 0 ms | $0.10 | $0.30 |
| 63 | OpenAI: o3 Mini | openai/o3-mini | OpenAI | 25.9 | 202.2 t/s | 6.18 s | $1.10 | $4.40 |
| 64 | Upstage: Solar Pro 3 | upstage/solar-pro-3 | Upstage | 25.9 | 0 t/s | 0 ms | $0 | $0 |
| 65 | OpenAI: o1-pro | openai/o1-pro | OpenAI | 25.8 | 0 t/s | 0 ms | $150.00 | $600.00 |
| 66 | xAI: Grok 3 | x-ai/grok-3 | xAI | 25.2 | 0 t/s | 0 ms | $4.00 | $20.00 |
| 67 | Perplexity: Sonar Reasoning Pro | perplexity/sonar-reasoning-pro | Perplexity | 24.6 | 0 t/s | 0 ms | $0 | $0 |
| 68 | OpenAI: gpt-oss-120b | openai/gpt-oss-120b | OpenAI | 24.5 | 373.5 t/s | 494 ms | $0.15 | $0.60 |
| 69 | NVIDIA: Nemotron 3 Nano 30B A3B | nvidia/nemotron-3-nano-30b-a3b | NVIDIA | 24.3 | 157.5 t/s | 1.88 s | $0.055 | $0.22 |
| 70 | xAI: Grok 4.1 Fast | x-ai/grok-4.1-fast | xAI | 23.6 | 0 t/s | 0 ms | $0 | $0 |
| 71 | xAI: Grok 4 Fast | x-ai/grok-4-fast | xAI | 23.1 | 0 t/s | 0 ms | $0.20 | $0.50 |
| 72 | OpenAI: GPT-4.1 Mini | openai/gpt-4.1-mini | OpenAI | 22.9 | 88.0 t/s | 567 ms | $0.40 | $1.60 |
| 73 | Prime Intellect: INTELLECT-3 | prime-intellect/intellect-3 | Prime-intellect | 22.2 | 0 t/s | 0 ms | $0 | $0 |
| 74 | Mistral: Mistral Medium 3.1 | mistralai/mistral-medium-3.1 | Mistral | 21.3 | 78.8 t/s | 523 ms | $0.40 | $2.00 |
| 75 | OpenAI: gpt-oss-20b | openai/gpt-oss-20b | OpenAI | 20.8 | 272.6 t/s | 456 ms | $0.06 | $0.20 |
| 76 | Google: Gemini 2.5 Flash | google/gemini-2.5-flash | Google | 20.6 | 184.3 t/s | 531 ms | $0.30 | $2.50 |
| 77 | OpenAI: GPT-5.4 | openai/gpt-5.4 | OpenAI | 20 | 0 t/s | 0 ms | $0 | $0 |
| 78 | Google: Gemini 2.5 Flash Lite Preview 09-2025 | google/gemini-2.5-flash-lite-preview-09-2025 | Google | 19.4 | 0 t/s | 0 ms | $0.10 | $0.40 |
| 79 | Mistral: Mistral Medium 3 | mistralai/mistral-medium-3 | Mistral | 18.8 | 49.3 t/s | 519 ms | $0.40 | $2.00 |
| 80 | Anthropic: Claude 3.5 Haiku | anthropic/claude-3.5-haiku | Anthropic | 18.7 | 0 t/s | 0 ms | $1.00 | $4.00 |
| 81 | OpenAI: GPT-4o (2024-08-06) | openai/gpt-4o-2024-08-06 | OpenAI | 18.6 | 135.3 t/s | 632 ms | $2.50 | $10.00 |
| 82 | Meta: Llama 4 Maverick | meta-llama/llama-4-maverick | Meta | 18.4 | 119.4 t/s | 626 ms | $0.35 | $0.85 |
| 83 | Google: Gemini 2.5 Flash Lite | google/gemini-2.5-flash-lite | Google | 17.6 | 293.4 t/s | 17.62 s | $0.10 | $0.40 |
| 84 | OpenAI: GPT-4o | openai/gpt-4o | OpenAI | 17.3 | 136.1 t/s | 482 ms | $2.50 | $10.00 |
| 85 | DeepSeek: R1 Distill Qwen 32B | deepseek/deepseek-r1-distill-qwen-32b | DeepSeek | 17.2 | 0 t/s | 0 ms | $0 | $0 |
| 86 | DeepSeek: R1 Distill Llama 70B | deepseek/deepseek-r1-distill-llama-70b | DeepSeek | 16 | 44.9 t/s | 371 ms | $0.70 | $1.05 |
| 87 | Perplexity: Sonar | perplexity/sonar | Perplexity | 15.5 | 0 t/s | 0 ms | $0 | $0 |
| 88 | Mistral: Devstral Small 1.1 | mistralai/devstral-small | Mistral | 15.2 | 59.6 t/s | 559 ms | $0.10 | $0.30 |
| 89 | Perplexity: Sonar Pro | perplexity/sonar-pro | Perplexity | 15.2 | 0 t/s | 0 ms | $0 | $0 |
| 90 | Baidu: ERNIE 4.5 300B A47B | baidu/ernie-4.5-300b-a47b | Baidu | 15 | 0 t/s | 0 ms | $0.28 | $1.10 |
| 91 | NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 | nvidia/llama-3.1-nemotron-ultra-253b-v1 | NVIDIA | 15 | 52.7 t/s | 727 ms | $0.60 | $1.80 |
| 92 | NVIDIA: Nemotron Nano 12B 2 VL | nvidia/nemotron-nano-12b-v2-vl | NVIDIA | 14.9 | 294.8 t/s | 269 ms | $0.20 | $0.60 |
| 93 | Google: Gemini 2.0 Flash Lite | google/gemini-2.0-flash-lite-001 | Google | 14.7 | 0 t/s | 0 ms | $0 | $0 |
| 94 | Meta: Llama 3.3 70B Instruct | meta-llama/llama-3.3-70b-instruct | Meta | 14.5 | 94.1 t/s | 628 ms | $0.58 | $0.71 |
| 95 | OpenAI: GPT-4o (2024-05-13) | openai/gpt-4o-2024-05-13 | OpenAI | 14.5 | 138.2 t/s | 530 ms | $5.00 | $15.00 |
| 96 | Mistral: Pixtral Large 2411 | mistralai/pixtral-large-2411 | Mistral | 14 | 61.0 t/s | 673 ms | $2.00 | $6.00 |
| 97 | OpenAI: GPT-4 Turbo | openai/gpt-4-turbo | OpenAI | 13.7 | 33.8 t/s | 1.47 s | $10.00 | $30.00 |
| 98 | Cohere: Command A | cohere/command-a | Cohere | 13.5 | 72.8 t/s | 292 ms | $2.50 | $10.00 |
| 99 | Meta: Llama 4 Scout | meta-llama/llama-4-scout | Meta | 13.5 | 108.0 t/s | 630 ms | $0.17 | $0.66 |
| 100 | NVIDIA: Llama 3.1 Nemotron 70B Instruct | nvidia/llama-3.1-nemotron-70b-instruct | NVIDIA | 13.4 | 291.5 t/s | 4.33 s | $1.20 | $1.20 |
| 101 | NVIDIA: Nemotron Nano 9B V2 | nvidia/nemotron-nano-9b-v2 | NVIDIA | 13.2 | 130.2 t/s | 638 ms | $0.05 | $0.195 |
| 102 | Mistral Large 2407 | mistralai/mistral-large-2407 | Mistral | 13 | 0 t/s | 0 ms | $2.00 | $6.00 |
| 103 | OpenAI: GPT-4.1 Nano | openai/gpt-4.1-nano | OpenAI | 13 | 189.8 t/s | 408 ms | $0.10 | $0.40 |
| 104 | OpenAI: GPT-4 | openai/gpt-4 | OpenAI | 12.8 | 45.9 t/s | 987 ms | $30.00 | $60.00 |
| 105 | OpenAI: GPT-4o-mini | openai/gpt-4o-mini | OpenAI | 12.6 | 70.5 t/s | 478 ms | $0.15 | $0.60 |
| 106 | Meta: Llama 3.1 70B Instruct | meta-llama/llama-3.1-70b-instruct | Meta | 12.5 | 35.6 t/s | 573 ms | $0.56 | $0.56 |
| 107 | Anthropic: Claude 3 Haiku | anthropic/claude-3-haiku | Anthropic | 12.3 | 0 t/s | 0 ms | $0.25 | $1.25 |
| 108 | Mistral: Saba | mistralai/mistral-saba | Mistral | 12.1 | 0 t/s | 0 ms | $0 | $0 |
| 109 | Meta: Llama 3.1 8B Instruct | meta-llama/llama-3.1-8b-instruct | Meta | 11.8 | 197.7 t/s | 502 ms | $0.10 | $0.10 |
| 110 | Mistral Large | mistralai/mistral-large | Mistral | 9.9 | 0 t/s | 0 ms | $4.00 | $12.00 |
| 111 | Meta: Llama 3.2 3B Instruct | meta-llama/llama-3.2-3b-instruct | Meta | 9.7 | 51.7 t/s | 663 ms | $0.15 | $0.15 |
| 112 | Meta: Llama 3 70B Instruct | meta-llama/llama-3-70b-instruct | Meta | 8.9 | 44.5 t/s | 690 ms | $0.65 | $2.75 |
| 113 | Meta: Llama 3.2 11B Vision Instruct | meta-llama/llama-3.2-11b-vision-instruct | Meta | 8.7 | 86.8 t/s | 495 ms | $0.245 | $0.245 |
| 114 | Mistral: Mixtral 8x7B Instruct | mistralai/mixtral-8x7b-instruct | Mistral | 7.7 | 0 t/s | 0 ms | $0.45 | $0.70 |
| 115 | Meta: Llama 3 8B Instruct | meta-llama/llama-3-8b-instruct | Meta | 6.4 | 81.9 t/s | 511 ms | $0.045 | $0.145 |
| 116 | Meta: Llama 3.2 1B Instruct | meta-llama/llama-3.2-1b-instruct | Meta | 6.3 | 86.4 t/s | 569 ms | $0.05 | $0.05 |
| 117 | Elephant | openrouter/elephant-alpha | Openrouter | n/a | n/a | n/a | n/a | n/a |
| 118 | NVIDIA: Nemotron 3 Nano Omni (free) | nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free | NVIDIA | n/a | n/a | n/a | n/a | n/a |
| 119 | Qwen: Qwen3.7 Plus | qwen/qwen3.7-plus | Alibaba / Qwen | n/a | n/a | n/a | n/a | n/a |
| 120 | Anthropic: Claude Opus 4.8 (Fast) | anthropic/claude-opus-4.8-fast | Anthropic | n/a | n/a | n/a | n/a | n/a |
| 121 | NVIDIA: Nemotron 3.5 Content Safety (free) | nvidia/nemotron-3.5-content-safety:free | NVIDIA | n/a | n/a | n/a | n/a | n/a |
| 122 | inclusionAI: Ring-2.6-1T (free) | inclusionai/ring-2.6-1t:free | Inclusionai | n/a | n/a | n/a | n/a | n/a |
| 123 | NVIDIA: Nemotron 3 Ultra (free) | nvidia/nemotron-3-ultra-550b-a55b:free | NVIDIA | n/a | n/a | n/a | n/a | n/a |
| 124 | Nex AGI: Nex-N2-Pro (free) | nex-agi/nex-n2-pro:free | Nex-agi | n/a | n/a | n/a | n/a | n/a |
| 125 | Baidu Qianfan: CoBuddy (free) | baidu/cobuddy:free | Baidu | n/a | n/a | n/a | n/a | n/a |
| 126 | DeepSeek: DeepSeek V4 Flash (free) | deepseek/deepseek-v4-flash:free | DeepSeek | n/a | n/a | n/a | n/a | n/a |
| 127 | IBM: Granite 4.1 8B | ibm-granite/granite-4.1-8b | Ibm-granite | n/a | n/a | n/a | n/a | n/a |
| 128 | OpenAI: GPT Chat Latest | openai/gpt-chat-latest | OpenAI | n/a | n/a | n/a | n/a | n/a |
| 129 | Qwen: Qwen3.7 Max | qwen/qwen3.7-max | Alibaba / Qwen | n/a | n/a | n/a | n/a | n/a |
| 130 | Google: Gemini 3.1 Flash Lite | google/gemini-3.1-flash-lite | Google | n/a | n/a | n/a | n/a | n/a |
| 131 | Owl Alpha | openrouter/owl-alpha | Openrouter | n/a | n/a | n/a | n/a | n/a |
| 132 | xAI: Grok Build 0.1 | x-ai/grok-build-0.1 | xAI | n/a | n/a | n/a | n/a | n/a |
| 133 | Anthropic: Claude Opus 4.7 (Fast) | anthropic/claude-opus-4.7-fast | Anthropic | n/a | n/a | n/a | n/a | n/a |
| 134 | OpenRouter: Fusion | openrouter/fusion | Openrouter | n/a | n/a | n/a | n/a | n/a |
| 135 | Perceptron: Perceptron Mk1 | perceptron/perceptron-mk1 | Perceptron | n/a | n/a | n/a | n/a | n/a |
| 136 | inclusionAI: Ling-2.6-flash (free) | inclusionai/ling-2.6-flash:free | Inclusionai | n/a | n/a | n/a | n/a | n/a |
| 137 | Qwen: Qwen3.6 Flash | qwen/qwen3.6-flash | Alibaba / Qwen | n/a | n/a | n/a | n/a | n/a |
| 138 | Qwen: Qwen3.5 Plus 2026-04-20 | qwen/qwen3.5-plus-20260420 | Alibaba / Qwen | n/a | n/a | n/a | n/a | n/a |
| 139 | inclusionAI: Ling-2.6-1T (free) | inclusionai/ling-2.6-1t:free | Inclusionai | n/a | n/a | n/a | n/a | n/a |
| 140 | Tencent: Hy3 preview (free) | tencent/hy3-preview:free | Tencent | n/a | n/a | n/a | n/a | n/a |
| 141 | MoonshotAI Kimi Latest | ~moonshotai/kimi-latest | ~moonshotai | n/a | n/a | n/a | n/a | n/a |
| 142 | Google Gemini Flash Latest | ~google/gemini-flash-latest | ~google | n/a | n/a | n/a | n/a | n/a |
| 143 | Poolside: Laguna XS.2 (free) | poolside/laguna-xs.2:free | Poolside | n/a | n/a | n/a | n/a | n/a |
| 144 | Poolside: Laguna M.1 (free) | poolside/laguna-m.1:free | Poolside | n/a | n/a | n/a | n/a | n/a |
| 145 | OpenAI GPT Mini Latest | ~openai/gpt-mini-latest | ~openai | n/a | n/a | n/a | n/a | n/a |
| 146 | Anthropic Claude Sonnet Latest | ~anthropic/claude-sonnet-latest | ~anthropic | n/a | n/a | n/a | n/a | n/a |
| 147 | OpenAI GPT Latest | ~openai/gpt-latest | ~openai | n/a | n/a | n/a | n/a | n/a |
| 148 | Google Gemini Pro Latest | ~google/gemini-pro-latest | ~google | n/a | n/a | n/a | n/a | n/a |
| 149 | Anthropic Claude Haiku Latest | ~anthropic/claude-haiku-latest | ~anthropic | n/a | n/a | n/a | n/a | n/a |
| 150 | Xiaomi: MiMo-V2.5 | xiaomi/mimo-v2.5 | Xiaomi | n/a | n/a | n/a | n/a | n/a |
| 151 | OpenAI: GPT-5.4 Image 2 | openai/gpt-5.4-image-2 | OpenAI | n/a | n/a | n/a | n/a | n/a |
| 152 | Baidu: Qianfan-OCR-Fast | baidu/qianfan-ocr-fast | Baidu | n/a | n/a | n/a | n/a | n/a |
| 153 | Qwen: Qwen3.6 35B A3B | qwen/qwen3.6-35b-a3b | Alibaba / Qwen | n/a | n/a | n/a | n/a | n/a |
| 154 | Qwen: Qwen3.6 Max Preview | qwen/qwen3.6-max-preview | Alibaba / Qwen | n/a | n/a | n/a | n/a | n/a |
| 155 | Baidu: Qianfan-OCR-Fast (free) | baidu/qianfan-ocr-fast:free | Baidu | n/a | n/a | n/a | n/a | n/a |
| 156 | Qwen: Qwen3.6 27B | qwen/qwen3.6-27b | Alibaba / Qwen | n/a | n/a | n/a | n/a | n/a |
| 157 | Tencent: Hy3 preview | tencent/hy3-preview | Tencent | n/a | n/a | n/a | n/a | n/a |
| 158 | Arcee AI: Trinity Large Thinking (free) | arcee-ai/trinity-large-thinking:free | Arcee-ai | n/a | n/a | n/a | n/a | n/a |
| 159 | Anthropic: Claude Opus Latest | ~anthropic/claude-opus-latest | ~anthropic | n/a | n/a | n/a | n/a | n/a |
| 160 | Qwen: Qwen3.6 Plus (free) | qwen/qwen3.6-plus:free | Alibaba / Qwen | n/a | n/a | n/a | n/a | n/a |
| 161 | Anthropic: Claude Opus 4.6 (Fast) | anthropic/claude-opus-4.6-fast | Anthropic | n/a | n/a | n/a | n/a | n/a |
| 162 | Pareto Code Router | openrouter/pareto-code | Openrouter | n/a | n/a | n/a | n/a | n/a |
| 163 | MoonshotAI: Kimi K2.6 (free) | moonshotai/kimi-k2.6:free | Moonshot AI | n/a | n/a | n/a | n/a | n/a |
| 164 | Google: Gemma 4 26B A4B (free) | google/gemma-4-26b-a4b-it:free | Google | n/a | n/a | n/a | n/a | n/a |
| 165 | MoonshotAI: Kimi K2.6 | moonshotai/kimi-k2.6 | Moonshot AI | n/a | n/a | n/a | n/a | n/a |
| 166 | Z.ai: GLM 5.1 | z-ai/glm-5.1 | Z-ai | n/a | n/a | n/a | n/a | n/a |
| 167 | Google: Gemma 4 26B A4B | google/gemma-4-26b-a4b-it | Google | n/a | n/a | n/a | n/a | n/a |
| 168 | Google: Gemma 4 31B (free) | google/gemma-4-31b-it:free | Google | n/a | n/a | n/a | n/a | n/a |
| 169 | Qwen: Qwen3.5 Plus 2026-02-15 | qwen/qwen3.5-plus-02-15 | Alibaba / Qwen | n/a | n/a | n/a | n/a | n/a |
| 170 | Arcee AI: Trinity Large Thinking | arcee-ai/trinity-large-thinking | Arcee-ai | n/a | n/a | n/a | n/a | n/a |
| 171 | Qwen: Qwen3.5 397B A17B | qwen/qwen3.5-397b-a17b | Alibaba / Qwen | n/a | n/a | n/a | n/a | n/a |
| 172 | Google: Gemma 4 31B | google/gemma-4-31b-it | Google | n/a | n/a | n/a | n/a | n/a |
| 173 | Z.ai: GLM 5V Turbo | z-ai/glm-5v-turbo | Z-ai | n/a | n/a | n/a | n/a | n/a |
| 174 | MiniMax: MiniMax M2.5 (free) | minimax/minimax-m2.5:free | MiniMax | n/a | n/a | n/a | n/a | n/a |
| 175 | Arcee AI: Trinity Large Preview | arcee-ai/trinity-large-preview | Arcee-ai | n/a | n/a | n/a | n/a | n/a |
| 176 | AllenAI: Olmo 3.1 32B Instruct | allenai/olmo-3.1-32b-instruct | Allenai | n/a | n/a | n/a | n/a | n/a |
| 177 | Qwen: Qwen3.6 Plus | qwen/qwen3.6-plus | Alibaba / Qwen | n/a | n/a | n/a | n/a | n/a |
| 178 | Qwen: Qwen3 235B A22B | qwen/qwen3-235b-a22b | Alibaba / Qwen | n/a | n/a | n/a | n/a | n/a |
| 179 | OpenAI: o4 Mini High | openai/o4-mini-high | OpenAI | n/a | n/a | n/a | n/a | n/a |
| 180 | xAI: Grok 4.20 Multi-Agent | x-ai/grok-4.20-multi-agent | xAI | n/a | n/a | n/a | n/a | n/a |
| 181 | Deep Cogito: Cogito v2.1 671B | deepcogito/cogito-v2.1-671b | Deepcogito | n/a | n/a | n/a | n/a | n/a |
| 182 | MoonshotAI: Kimi K2.5 | moonshotai/kimi-k2.5 | Moonshot AI | n/a | n/a | n/a | n/a | n/a |
| 183 | Z.ai: GLM 4.7 | z-ai/glm-4.7 | Z-ai | n/a | n/a | n/a | n/a | n/a |
| 184 | Google: Gemini 3 Flash Preview | google/gemini-3-flash-preview | Google | n/a | n/a | n/a | n/a | n/a |
| 185 | Z.ai: GLM 4.7 Flash | z-ai/glm-4.7-flash | Z-ai | n/a | n/a | n/a | n/a | n/a |
| 186 | ByteDance Seed: Seed 1.6 Flash | bytedance-seed/seed-1.6-flash | Bytedance-seed | n/a | n/a | n/a | n/a | n/a |
| 187 | Google: Gemma 3 27B | google/gemma-3-27b-it | Google | n/a | n/a | n/a | n/a | n/a |
| 188 | Meituan: LongCat Flash Chat | meituan/longcat-flash-chat | Meituan | n/a | n/a | n/a | n/a | n/a |
| 189 | Google: Lyria 3 Pro Preview | google/lyria-3-pro-preview | Google | n/a | n/a | n/a | n/a | n/a |
| 190 | Inception: Mercury | inception/mercury | Inception | n/a | n/a | n/a | n/a | n/a |
| 191 | AllenAI: Olmo 3 32B Think | allenai/olmo-3-32b-think | Allenai | n/a | n/a | n/a | n/a | n/a |
| 192 | Google: Nano Banana Pro (Gemini 3 Pro Image Preview) | google/gemini-3-pro-image-preview | Google | n/a | n/a | n/a | n/a | n/a |
| 193 | Google: Gemma 3n 4B | google/gemma-3n-e4b-it | Google | n/a | n/a | n/a | n/a | n/a |
| 194 | Google: Gemma 3 4B (free) | google/gemma-3-4b-it:free | Google | n/a | n/a | n/a | n/a | n/a |
| 195 | Microsoft: Phi 4 Mini Instruct | microsoft/phi-4-mini-instruct | Microsoft | n/a | n/a | n/a | n/a | n/a |
| 196 | Venice: Uncensored (free) | cognitivecomputations/dolphin-mistral-24b-venice-edition:free | Cognitivecomputations | n/a | n/a | n/a | n/a | n/a |
| 197 | Qwen: Qwen3 VL 235B A22B Instruct | qwen/qwen3-vl-235b-a22b-instruct | Alibaba / Qwen | n/a | n/a | n/a | n/a | n/a |
| 198 | Mistral: Mistral Small 3.2 24B | mistralai/mistral-small-3.2-24b-instruct | Mistral | n/a | n/a | n/a | n/a | n/a |
| 199 | Google: Lyria 3 Clip Preview | google/lyria-3-clip-preview | Google | n/a | n/a | n/a | n/a | n/a |
| 200 | Kwaipilot: KAT-Coder-Pro V2 | kwaipilot/kat-coder-pro-v2 | Kwaipilot | n/a | n/a | n/a | n/a | n/a |
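
The Latency column mixes units: some rows are given in milliseconds (for example 834 ms) and others in seconds. Anyone copying the values into a script needs one unit before sorting. The following is a minimal sketch, assuming each cell is a number followed by a space and either `ms` or `s`; the function name `latency_to_seconds` is hypothetical and not part of the site.

```python
def latency_to_seconds(text: str) -> float:
    """Convert a latency cell such as '834 ms' or '27,35 s' to seconds."""
    value, unit = text.strip().split()
    # Accept both a decimal comma and a decimal point.
    number = float(value.replace(",", "."))
    if unit == "ms":
        return number / 1000.0
    if unit == "s":
        return number
    raise ValueError(f"unknown latency unit: {unit!r}")


# Sort a few cells from the table by actual latency, lowest first.
cells = ["27.35 s", "834 ms", "1.18 s"]
by_latency = sorted(cells, key=latency_to_seconds)
print(by_latency)
```

Sorting the raw strings would order them lexically, which is why the conversion comes first.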
🧮 Tool

Pricing Calculator

Estimate your monthly API costs: enter your expected token volume and we calculate the cost for the top 15 models.

| # | Model | Vendor | Quality | Cost/Month (USD) |
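
The arithmetic behind such an estimate can be sketched in a few lines. This is an illustration under stated assumptions, not the site's implementation: the function name `monthly_cost_usd` and the example volume of 50 million input and 10 million output tokens per month are hypothetical; the prices are the per-1M values listed in the leaderboard.

```python
def monthly_cost_usd(input_tokens: int, output_tokens: int,
                     price_in_per_m: float, price_out_per_m: float) -> float:
    """Cost in USD for a token volume, given prices per 1 million tokens."""
    return (input_tokens / 1_000_000) * price_in_per_m \
         + (output_tokens / 1_000_000) * price_out_per_m


# (input price, output price) in USD per 1M tokens, taken from the leaderboard.
PRICES = {
    "anthropic/claude-opus-4.8": (6.25, 25.00),
    "google/gemini-3.5-flash": (1.50, 9.00),
    "minimax/minimax-m3": (0.30, 1.20),
}

# Hypothetical monthly volume.
INPUT_TOKENS = 50_000_000
OUTPUT_TOKENS = 10_000_000

for model_id, (p_in, p_out) in PRICES.items():
    cost = monthly_cost_usd(INPUT_TOKENS, OUTPUT_TOKENS, p_in, p_out)
    print(f"{model_id}: ${cost:,.2f} per month")
```

Because input and output are priced separately, the ratio between the two volumes can change which model is cheapest for a given workload.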
📈 Tool

Quality vs. Price (Pareto Chart)

Where is the best trade-off? Models in the upper left are the Pareto optima: high Quality, low price.

Legend: Pareto optimum · Other models
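
A model is Pareto-optimal when no other model is at least as good on both axes and strictly better on one. The sketch below shows one way to compute that set. It assumes the price axis is the output price per 1M tokens, which the page does not state; `Model` and `pareto_front` are hypothetical names, and the six data points are the top rows of the leaderboard.

```python
from typing import NamedTuple


class Model(NamedTuple):
    model_id: str
    quality: float    # Quality Index, higher is better
    price_out: float  # USD per 1M output tokens, lower is better


def pareto_front(models: list[Model]) -> list[Model]:
    """Return the models that no other model dominates."""
    front = []
    for m in models:
        dominated = any(
            o.quality >= m.quality and o.price_out <= m.price_out
            and (o.quality > m.quality or o.price_out < m.price_out)
            for o in models
        )
        if not dominated:
            front.append(m)
    return front


MODELS = [
    Model("anthropic/claude-opus-4.8", 61.4, 25.00),
    Model("openai/gpt-5.5", 60.2, 30.00),
    Model("google/gemini-3.1-pro-preview", 57.2, 12.00),
    Model("google/gemini-3.5-flash", 54.8, 9.00),
    Model("minimax/minimax-m3", 54.7, 1.20),
    Model("openai/gpt-5.3-codex", 53.6, 14.00),
]

for m in pareto_front(MODELS):
    print(m.model_id, m.quality, m.price_out)
```

In this sample, openai/gpt-5.5 and openai/gpt-5.3-codex drop out because another model offers higher Quality at a lower output price. The pairwise check is quadratic in the number of models, which is fine for a list of a few hundred rows.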

❓ Frequently Asked Questions About the AI Models Leaderboard

Where does the leaderboard data come from?
Quality Index, Speed (output tokens/s), Latency (time to first token), and prices (USD per 1 million tokens) come directly from Artificial Analysis. Synchronization runs daily at 04:00 UTC. We do not store benchmark values of our own and do not run our own evaluation.
What exactly does the Quality Index measure?
The Artificial Analysis Intelligence Index is a composite score built from more than ten independent benchmarks: MMLU-Pro (knowledge), GPQA and HLE (reasoning), LiveCodeBench and SciCode (coding), AIME (mathematics), IFBench (instruction following), LCR (long-context recall), and τ² (tool use). Scale 0–100; higher is better.
Why are some well-known models missing from the leaderboard?
We only show models that are marked is_active=true in the registry and for which Artificial Analysis provides complete benchmark data. Pure image or audio models, deprecated versions, and closed-beta models without a public API do not appear.
How do I use sorting and filters effectively?
Sort by Quality for maximum accuracy, by Speed for high throughput, by Latency for responsive streaming UIs, and by price for cost-sensitive workloads. Combine the vendor and price filters to find, for example, "only OpenAI under 5 USD per 1M output tokens".
How do I calculate concrete token costs?
In the token cost calculator you can enter input and output volume and play through the monthly cost for each model live. For a direct head-to-head comparison of two models, use the comparison pages.
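
The filter-and-sort combination from the FAQ ("only OpenAI under 5 USD per 1M output tokens") can be reproduced offline on copied rows. The following is a minimal sketch, not the site's code: `Row` and `filter_rows` are hypothetical names, and the four rows are taken from the leaderboard with latency expressed in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    model_id: str
    vendor: str
    quality: float
    speed_tps: float
    latency_s: float
    price_in: float
    price_out: float


ROWS = [
    Row("openai/gpt-5.5", "OpenAI", 60.2, 66.2, 84.68, 5.00, 30.00),
    Row("google/gemini-3.5-flash", "Google", 54.8, 201.9, 12.44, 1.50, 9.00),
    Row("openai/gpt-5.4-mini", "OpenAI", 48.9, 169.8, 8.96, 0.75, 4.50),
    Row("openai/gpt-5-mini", "OpenAI", 38.9, 88.6, 19.07, 0.25, 2.00),
]


def filter_rows(rows, vendor=None, max_price_out=None):
    """Keep rows matching the vendor and priced strictly below the output cap."""
    result = []
    for row in rows:
        if vendor is not None and row.vendor != vendor:
            continue
        if max_price_out is not None and row.price_out >= max_price_out:
            continue
        result.append(row)
    return result


cheap_openai = filter_rows(ROWS, vendor="OpenAI", max_price_out=5.00)

# Quality: higher is better. Latency: lower is better.
by_quality = sorted(cheap_openai, key=lambda r: r.quality, reverse=True)
by_latency = sorted(cheap_openai, key=lambda r: r.latency_s)

print([r.model_id for r in by_quality])
print([r.model_id for r in by_latency])
```

Note the sort direction: Quality and Speed are sorted descending, while Latency and price are sorted ascending.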

Methodology. The Quality Index is the Artificial Analysis Intelligence Index (a combined ranking from MMLU-Pro, GPQA, HLE, LiveCodeBench, SciCode, AIME, IFBench, LCR, and τ²). Speed and Latency are median values across all providers; prices are per 1 million tokens (input/output). Synchronization runs daily at 04:00 UTC.
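
To illustrate what a median across providers means, here is a small sketch using Python's standard `statistics.median`. The per-provider numbers are invented for illustration and are not taken from Artificial Analysis; how the upstream source samples each provider is not described on this page.

```python
from statistics import median

# Hypothetical measurements for one model served by three providers.
provider_speed_tps = [48.2, 55.3, 61.7]   # output tokens per second
provider_latency_s = [0.90, 1.07, 2.40]   # time to first token in seconds

speed = median(provider_speed_tps)
latency = median(provider_latency_s)

print(f"Speed: {speed} t/s, Latency: {latency} s")
```

A median is less sensitive than a mean to a single very slow or very fast provider, so one outlier host does not move the displayed value much.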

PromptLoop is not affiliated with Artificial Analysis. All trademarks are the property of their respective owners.
