📊 Live Benchmark

AI Model Leaderboard 2026

All relevant large language models at a glance, sortable by Quality Index, speed, latency, and price. Data source: Artificial Analysis.

200 active models from 12 vendor families · Last synchronization: July 24, 2026

How do you read this leaderboard?

Quality Index is a composite score from Artificial Analysis, built from more than ten independent benchmarks (MMLU-Pro, GPQA, HLE, LiveCodeBench, SciCode, AIME, and others). Higher is better. Speed measures output tokens per second, taken as the median across all hosting providers. Latency is the time to first token in seconds, which matters for streaming UIs. Prices are in US dollars per 1 million tokens, listed separately for input and output. Sort by your priority, filter by vendor or price bucket, then check concrete costs in the token calculator or compare two models head to head.
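To work with the table programmatically, each tab-separated row can be turned into a typed record. The following is a minimal sketch, not the site's own code: the `Row` class and the helper names are hypothetical, and it assumes the column order of the table below. It accepts both decimal commas and decimal points, converts millisecond latencies to seconds, and maps dash placeholders to `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    rank: int
    model: str                   # display name plus slug, as printed in the table
    vendor: str
    quality: Optional[float]
    speed_tps: Optional[float]   # output tokens per second
    latency_s: Optional[float]   # time to first token, in seconds
    price_in: Optional[float]    # USD per 1M input tokens
    price_out: Optional[float]   # USD per 1M output tokens


def to_float(text: str) -> Optional[float]:
    """Parse '43,1', '43.1' or '$0.435'; return None for non-numeric placeholders.

    Assumes no thousands separators, which holds for the table on this page.
    """
    cleaned = text.strip().replace("$", "").replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        return None


def parse_latency(text: str) -> Optional[float]:
    """Convert '970 ms' or '1,01 s' to seconds; return None if no unit is present."""
    parts = text.split()
    if len(parts) != 2:
        return None
    value = to_float(parts[0])
    if value is None:
        return None
    return value / 1000 if parts[1] == "ms" else value


def parse_row(line: str) -> Row:
    rank, model, vendor, quality, speed, latency, price = line.split("\t")
    price_tokens = price.split()  # e.g. ['$0.435', 'in', '$0.87', 'out']
    return Row(
        rank=int(rank),
        model=model,
        vendor=vendor,
        quality=to_float(quality),
        speed_tps=to_float(speed.split()[0]),
        latency_s=parse_latency(latency),
        price_in=to_float(price_tokens[0]),
        price_out=to_float(price_tokens[2]),
    )


row = parse_row(
    "14\tDeepSeek: DeepSeek V4 Pro deepseek/deepseek-v4-pro\tDeepSeek"
    "\t43,1\t70,4 t/s\t970 ms\t$0.435 in $0.87 out"
)
print(row.quality, row.latency_s, row.price_out)  # 43.1 0.97 0.87
```

Normalizing latency to seconds at parse time avoids comparing a value such as 970 ms against 1.01 s as raw numbers when sorting.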

#	Model	Vendor	Quality	Speed	Latency	Price (USD/1M tokens)
1	Anthropic: Claude Fable 5 anthropic/claude-fable-5	Anthropic	59.9	60.2 t/s	73.24 s	$10.00 in $50.00 out
2	OpenAI: GPT-5.6 Sol openai/gpt-5.6-sol	OpenAI	57.7	60.5 t/s	31.00 s	$5.00 in $30.00 out
3	Anthropic: Claude Opus 4.8 anthropic/claude-opus-4.8	Anthropic	55.7	65.9 t/s	48.25 s	$5.00 in $25.00 out
4	xAI: Grok 4.5 x-ai/grok-4.5	xAI	53.8	66.2 t/s	10.89 s	$2.00 in $6.00 out
5	Anthropic: Claude Opus 4.7 anthropic/claude-opus-4.7	Anthropic	53.5	60.7 t/s	10.63 s	$5.00 in $25.00 out
6	Meta: Muse Spark 1.1 meta/muse-spark-1.1	Meta	50.6	124.4 t/s	1.01 s	$1.25 in $4.25 out
7	OpenAI: GPT-5.5 openai/gpt-5.5	OpenAI	50.4	78.1 t/s	7.12 s	$5.00 in $30.00 out
8	Google: Gemini 3.5 Flash google/gemini-3.5-flash	Google	50.2	258.9 t/s	14.96 s	$1.50 in $9.00 out
9	Google: Gemini 3.6 Flash google/gemini-3.6-flash	Google	50.1	257.5 t/s	17.29 s	$1.50 in $7.50 out
10	Google: Gemini 3.1 Pro Preview google/gemini-3.1-pro-preview	Google	46.5	130.5 t/s	30.30 s	$2.00 in $12.00 out
11	OpenAI: GPT-5.6 Luna openai/gpt-5.6-luna	OpenAI	46.1	176.9 t/s	5.77 s	$1.00 in $6.00 out
12	MiniMax: MiniMax M3 minimax/minimax-m3	MiniMax	44.4	90.3 t/s	1.35 s	$0.30 in $1.20 out
13	OpenAI: GPT-5.3-Codex openai/gpt-5.3-codex	OpenAI	44.3	124.1 t/s	52.81 s	$1.75 in $14.00 out
14	DeepSeek: DeepSeek V4 Pro deepseek/deepseek-v4-pro	DeepSeek	43.1	70.4 t/s	970 ms	$0.435 in $0.87 out
15	OpenAI: GPT-5.2 openai/gpt-5.2	OpenAI	42.2	78.0 t/s	125.29 s	$1.75 in $14.00 out
16	Anthropic: Claude Sonnet 5 anthropic/claude-sonnet-5	Anthropic	41.7	64.9 t/s	5.79 s	$2.00 in $10.00 out
17	Tencent: Hy3 tencent/hy3	Tencent	41.2	65.3 t/s	1.87 s	$0.14 in $0.58 out
18	OpenAI: GPT-5.6 Terra openai/gpt-5.6-terra	OpenAI	40.5	105.9 t/s	1.49 s	$2.50 in $15.00 out
19	Xiaomi: MiMo-V2-Pro xiaomi/mimo-v2-pro	Xiaomi	40.3	0 t/s	0 ms	$0 in $0 out
20	OpenAI: GPT-5.2-Codex openai/gpt-5.2-codex	OpenAI	40.1	148.4 t/s	7.88 s	$1.75 in $14.00 out
21	MiniMax: MiniMax M2.7 minimax/minimax-m2.7	MiniMax	38.1	59.8 t/s	1.05 s	$0.30 in $1.20 out
22	Anthropic: Claude Opus 4.6 anthropic/claude-opus-4.6	Anthropic	37.8	69.7 t/s	3.52 s	$5.00 in $25.00 out
23	NVIDIA: Nemotron 3 Ultra nvidia/nemotron-3-ultra-550b-a55b	NVIDIA	37.8	180.4 t/s	889 ms	$0.675 in $2.675 out
24	xAI: Grok 4.3 x-ai/grok-4.3	xAI	37.6	113.2 t/s	26.59 s	$1.25 in $2.50 out
25	xAI: Grok 4.20 x-ai/grok-4.20	xAI	37	215.3 t/s	20.20 s	$2.00 in $6.00 out
26	OpenAI: GPT-5.1 openai/gpt-5.1	OpenAI	36.9	119.2 t/s	40.40 s	$1.25 in $10.00 out
27	Google: Gemini 3.5 Flash Lite google/gemini-3.5-flash-lite	Google	36.5	362.2 t/s	8.99 s	$0.30 in $2.50 out
28	OpenAI: GPT-5 Codex openai/gpt-5-codex	OpenAI	36.1	180.4 t/s	8.14 s	$1.25 in $10.00 out
29	Anthropic: Claude Sonnet 4.6 anthropic/claude-sonnet-4.6	Anthropic	35.9	66.1 t/s	2.10 s	$3.00 in $15.00 out
30	Xiaomi: MiMo-V2-Omni xiaomi/mimo-v2-omni	Xiaomi	35	0 t/s	0 ms	$0 in $0 out
31	OpenAI: GPT-5.1-Codex openai/gpt-5.1-codex	OpenAI	34.7	223.6 t/s	3.53 s	$1.25 in $10.00 out
32	OpenAI: GPT-5 openai/gpt-5	OpenAI	34.7	116.7 t/s	150.51 s	$1.25 in $10.00 out
33	Anthropic: Claude Opus 4.5 anthropic/claude-opus-4.5	Anthropic	34.7	82.8 t/s	1.77 s	$5.00 in $25.00 out
34	MiniMax: MiniMax M2.5 minimax/minimax-m2.5	MiniMax	33.7	91.1 t/s	931 ms	$0.30 in $1.20 out
35	Tencent: Hy3 preview tencent/hy3-preview	Tencent	33.6	117.8 t/s	1.67 s	$0.123 in $0.43 out
36	xAI: Grok 4 x-ai/grok-4	xAI	33.3	0 t/s	0 ms	$5.50 in $27.50 out
37	OpenAI: o3 Pro openai/o3-pro	OpenAI	32.5	0 t/s	0 ms	$20.00 in $80.00 out
38	DeepSeek: DeepSeek V3.2 deepseek/deepseek-v3.2	DeepSeek	32	0 t/s	0 ms	$0.28 in $0.42 out
39	MiniMax: MiniMax M2.1 minimax/minimax-m2.1	MiniMax	31.4	100.3 t/s	928 ms	$0.30 in $1.20 out
40	Xiaomi: MiMo-V2-Flash xiaomi/mimo-v2-flash	Xiaomi	31.2	0 t/s	0 ms	$0.10 in $0.30 out
41	OpenAI: GPT-5 Mini openai/gpt-5-mini	OpenAI	30.9	80.7 t/s	22.04 s	$0.25 in $2.00 out
42	inclusionAI: Ring-2.6-1T inclusionai/ring-2.6-1t	Inclusionai	30.6	127.2 t/s	1.99 s	$0.30 in $2.50 out
43	OpenAI: GPT-5.1-Codex-Mini openai/gpt-5.1-codex-mini	OpenAI	30.6	205.2 t/s	3.42 s	$0.25 in $2.00 out
44	DeepSeek: DeepSeek V3.1 Terminus deepseek/deepseek-v3.1-terminus	DeepSeek	30.4	0 t/s	0 ms	$1.635 in $2.75 out
45	OpenAI: o3 openai/o3	OpenAI	30.4	127.7 t/s	7.98 s	$2.00 in $8.00 out
46	StepFun: Step 3.7 Flash stepfun/step-3.7-flash	Stepfun	30.3	399.5 t/s	608 ms	$0.20 in $1.15 out
47	OpenAI: GPT-5.4 Nano openai/gpt-5.4-nano	OpenAI	30.2	127.9 t/s	5.10 s	$0.20 in $1.25 out
48	Mistral: Mistral Medium 3.5 mistralai/mistral-medium-3-5	Mistral	29.9	86.3 t/s	760 ms	$1.50 in $7.50 out
49	OpenAI: GPT-5.4 Mini openai/gpt-5.4-mini	OpenAI	29.8	187.2 t/s	13.36 s	$0.75 in $4.50 out
50	Anthropic: Claude Sonnet 4.5 anthropic/claude-sonnet-4.5	Anthropic	29.3	76.1 t/s	1.34 s	$3.00 in $15.00 out
51	DeepSeek: DeepSeek V4 Flash deepseek/deepseek-v4-flash	DeepSeek	28.7	122.5 t/s	908 ms	$0.14 in $0.28 out
52	MiniMax: MiniMax M2 minimax/minimax-m2	MiniMax	28.3	114.0 t/s	949 ms	$0.30 in $1.20 out
53	Anthropic: Claude Opus 4.1 anthropic/claude-opus-4.1	Anthropic	28.2	0 t/s	0 ms	$15.00 in $75.00 out
54	Xiaomi: MiMo-V2.5-Pro xiaomi/mimo-v2.5-pro	Xiaomi	27.9	63.0 t/s	1.55 s	$0.661 in $1.722 out
55	OpenAI: GPT-5.4 openai/gpt-5.4	OpenAI	27.7	103.4 t/s	706 ms	$2.50 in $15.00 out
56	inclusionAI: Ling-2.6-1T inclusionai/ling-2.6-1t	Inclusionai	26.1	0 t/s	0 ms	$0.30 in $2.50 out
57	StepFun: Step 3.5 Flash stepfun/step-3.5-flash	Stepfun	26	296.1 t/s	732 ms	$0.10 in $0.30 out
58	Google: Gemini 2.5 Pro google/gemini-2.5-pro	Google	25.8	138.9 t/s	22.51 s	$1.25 in $10.00 out
59	OpenAI: o4 Mini openai/o4-mini	OpenAI	25.6	116.6 t/s	18.78 s	$1.10 in $4.40 out
60	Anthropic: Claude Opus 4 anthropic/claude-opus-4	Anthropic	25.5	0 t/s	0 ms	$15.00 in $75.00 out
61	Anthropic: Claude Sonnet 4 anthropic/claude-sonnet-4	Anthropic	25.5	0 t/s	0 ms	$3.00 in $15.00 out
62	NVIDIA: Nemotron 3 Super nvidia/nemotron-3-super-120b-a12b	NVIDIA	25.4	152.4 t/s	1.01 s	$0.25 in $0.775 out
63	Google: Gemini 3.1 Flash Lite Preview google/gemini-3.1-flash-lite-preview	Google	25	311.0 t/s	5.03 s	$0.25 in $1.50 out
64	Anthropic: Claude Haiku 4.5 anthropic/claude-haiku-4.5	Anthropic	23.7	101.3 t/s	733 ms	$1.00 in $5.00 out
65	Anthropic: Claude 3.7 Sonnet anthropic/claude-3.7-sonnet	Anthropic	23.5	0 t/s	0 ms	$3.00 in $15.00 out
66	OpenAI: o1 openai/o1	OpenAI	23.4	0 t/s	0 ms	$15.00 in $60.00 out
67	xAI: Grok 3 Mini x-ai/grok-3-mini	xAI	22.5	55.2 t/s	1.52 s	$0.30 in $0.50 out
68	DeepSeek: DeepSeek V3.2 Speciale deepseek/deepseek-v3.2-speciale	DeepSeek	22.2	0 t/s	0 ms	$0 in $0 out
69	xAI: Grok Code Fast 1 x-ai/grok-code-fast-1	xAI	21.6	0 t/s	0 ms	$0 in $0 out
70	Inception: Mercury 2 inception/mercury-2	Inception	21.4	1096.8 t/s	3.75 s	$0.25 in $0.75 out
71	Google: Gemini 2.5 Flash google/gemini-2.5-flash	Google	20.1	231.2 t/s	9.17 s	$0.30 in $2.50 out
72	DeepSeek: R1 deepseek/deepseek-r1	DeepSeek	20.1	0 t/s	0 ms	$1.35 in $4.20 out
73	OpenAI: GPT-5 Nano openai/gpt-5-nano	OpenAI	19.9	164.1 t/s	71.97 s	$0.05 in $0.40 out
74	OpenAI: GPT-4.1 openai/gpt-4.1	OpenAI	19.4	155.9 t/s	1.01 s	$2.00 in $8.00 out
75	OpenAI: o3 Mini openai/o3-mini	OpenAI	19	198.4 t/s	6.10 s	$1.10 in $4.40 out
76	OpenAI: o1-pro openai/o1-pro	OpenAI	18.9	0 t/s	0 ms	$150.00 in $600.00 out
77	Perplexity: Sonar Reasoning Pro perplexity/sonar-reasoning-pro	Perplexity	17.8	0 t/s	0 ms	$0 in $0 out
78	xAI: Grok 4.1 Fast x-ai/grok-4.1-fast	xAI	16.9	0 t/s	0 ms	$0 in $0 out
79	xAI: Grok 4 Fast x-ai/grok-4-fast	xAI	16.5	0 t/s	0 ms	$0.20 in $0.50 out
80	Prime Intellect: INTELLECT-3 prime-intellect/intellect-3	Prime-intellect	15.6	0 t/s	0 ms	$0 in $0 out
81	xAI: Grok 3 x-ai/grok-3	xAI	15.1	0 t/s	0 ms	$0 in $0 out
82	Google: Gemini 2.5 Flash Lite Preview 09-2025 google/gemini-2.5-flash-lite-preview-09-2025	Google	15.1	0 t/s	0 ms	$0.10 in $0.40 out
83	OpenAI: gpt-oss-120b openai/gpt-oss-120b	OpenAI	14.9	317.9 t/s	578 ms	$0.15 in $0.60 out
84	OpenAI: GPT-4.1 Mini openai/gpt-4.1-mini	OpenAI	14.8	87.7 t/s	505 ms	$0.40 in $1.60 out
85	Mistral: Mistral Medium 3.1 mistralai/mistral-medium-3.1	Mistral	14.7	109.3 t/s	490 ms	$0.40 in $2.00 out
86	OpenAI: gpt-oss-20b openai/gpt-oss-20b	OpenAI	14.3	249.7 t/s	463 ms	$0.07 in $0.20 out
87	Meta: Llama 4 Maverick meta-llama/llama-4-maverick	Meta	14.3	102.2 t/s	593 ms	$0.31 in $0.85 out
88	Upstage: Solar Pro 3 upstage/solar-pro-3	Upstage	14.1	0 t/s	0 ms	$0 in $0 out
89	inclusionAI: Ling-2.6-flash inclusionai/ling-2.6-flash	Inclusionai	14.1	125.9 t/s	645 ms	$0.10 in $0.30 out
90	Mistral: Mistral Medium 3 mistralai/mistral-medium-3	Mistral	12.5	49.1 t/s	462 ms	$0.40 in $2.00 out
91	Anthropic: Claude 3.5 Haiku anthropic/claude-3.5-haiku	Anthropic	12.3	0 t/s	0 ms	$0 in $0 out
92	Perplexity: Sonar perplexity/sonar	Perplexity	11.7	0 t/s	0 ms	$0 in $0 out
93	OpenAI: GPT-4o openai/gpt-4o	OpenAI	11.2	161.9 t/s	527 ms	$2.50 in $10.00 out
94	DeepSeek: R1 Distill Qwen 32B deepseek/deepseek-r1-distill-qwen-32b	DeepSeek	11	0 t/s	0 ms	$0 in $0 out
95	Meta: Llama 4 Scout meta-llama/llama-4-scout	Meta	10	86.3 t/s	593 ms	$0.18 in $0.66 out
96	DeepSeek: R1 Distill Llama 70B deepseek/deepseek-r1-distill-llama-70b	DeepSeek	9.9	36.6 t/s	524 ms	$0.70 in $1.05 out
97	OpenAI: GPT-4.1 Nano openai/gpt-4.1-nano	OpenAI	9.6	153.7 t/s	442 ms	$0.10 in $0.40 out
98	OpenAI: GPT-4o (2024-08-06) openai/gpt-4o-2024-08-06	OpenAI	9.6	75.3 t/s	1.17 s	$2.50 in $10.00 out
99	Meta: Llama 3.3 70B Instruct meta-llama/llama-3.3-70b-instruct	Meta	9.4	77.2 t/s	634 ms	$0.59 in $0.71 out
100	Mistral: Devstral Small 1.1 mistralai/devstral-small	Mistral	9.3	0 t/s	0 ms	$0 in $0 out
101	Perplexity: Sonar Pro perplexity/sonar-pro	Perplexity	9.3	0 t/s	0 ms	$0 in $0 out
102	NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 nvidia/llama-3.1-nemotron-ultra-253b-v1	NVIDIA	9.1	53.3 t/s	700 ms	$0.60 in $1.80 out
103	Baidu: ERNIE 4.5 300B A47B baidu/ernie-4.5-300b-a47b	Baidu	9	0 t/s	0 ms	$0.28 in $1.10 out
104	NVIDIA: Nemotron Nano 12B 2 VL nvidia/nemotron-nano-12b-v2-vl	NVIDIA	9	50.0 t/s	3.52 s	$0.20 in $0.60 out
105	Google: Gemini 2.0 Flash Lite google/gemini-2.0-flash-lite-001	Google	8.8	0 t/s	0 ms	$0 in $0 out
106	OpenAI: GPT-4o (2024-05-13) openai/gpt-4o-2024-05-13	OpenAI	8.6	94.5 t/s	471 ms	$5.00 in $15.00 out
107	Mistral: Pixtral Large 2411 mistralai/pixtral-large-2411	Mistral	8.1	0 t/s	0 ms	$0 in $0 out
108	OpenAI: GPT-4 Turbo openai/gpt-4-turbo	OpenAI	7.9	36.1 t/s	1.25 s	$10.00 in $30.00 out
109	Cohere: Command A cohere/command-a	Cohere	7.7	55.5 t/s	515 ms	$2.50 in $10.00 out
110	NVIDIA: Llama 3.1 Nemotron 70B Instruct nvidia/llama-3.1-nemotron-70b-instruct	NVIDIA	7.6	71.2 t/s	5.17 s	$1.20 in $1.20 out
111	Meta: Llama 3.1 8B Instruct meta-llama/llama-3.1-8b-instruct	Meta	7.6	153.4 t/s	518 ms	$0.075 in $0.09 out
112	NVIDIA: Nemotron 3 Nano 30B A3B nvidia/nemotron-3-nano-30b-a3b	NVIDIA	7.4	121.6 t/s	563 ms	$0.05 in $0.20 out
113	NVIDIA: Nemotron Nano 9B V2 nvidia/nemotron-nano-9b-v2	NVIDIA	7.4	111.6 t/s	2.39 s	$0.05 in $0.195 out
114	Mistral Large 2407 mistralai/mistral-large-2407	Mistral	7.3	0 t/s	0 ms	$2.00 in $6.00 out
115	OpenAI: GPT-4 openai/gpt-4	OpenAI	7	43.5 t/s	759 ms	$30.00 in $60.00 out
116	Google: Gemini 2.5 Flash Lite google/gemini-2.5-flash-lite	Google	6.9	299.5 t/s	315 ms	$0.10 in $0.40 out
117	OpenAI: GPT-4o-mini openai/gpt-4o-mini	OpenAI	6.9	68.3 t/s	545 ms	$0.15 in $0.60 out
118	Meta: Llama 3.1 70B Instruct meta-llama/llama-3.1-70b-instruct	Meta	6.8	92.2 t/s	534 ms	$0.56 in $0.56 out
119	Mistral: Saba mistralai/mistral-saba	Mistral	6.4	0 t/s	0 ms	$0 in $0 out
120	Mistral Large mistralai/mistral-large	Mistral	4.4	0 t/s	0 ms	$4.00 in $12.00 out
121	Meta: Llama 3.2 3B Instruct meta-llama/llama-3.2-3b-instruct	Meta	4.2	0 t/s	0 ms	$0 in $0 out
122	Anthropic: Claude 3 Haiku anthropic/claude-3-haiku	Anthropic	3.9	0 t/s	0 ms	$0.25 in $1.25 out
123	Meta: Llama 3 70B Instruct meta-llama/llama-3-70b-instruct	Meta	3.5	44.9 t/s	651 ms	$0.65 in $2.75 out
124	Meta: Llama 3.2 11B Vision Instruct meta-llama/llama-3.2-11b-vision-instruct	Meta	3.3	15.7 t/s	516 ms	$0.357 in $0.357 out
125	Mistral: Mixtral 8x7B Instruct mistralai/mixtral-8x7b-instruct	Mistral	2.4	0 t/s	0 ms	$0.45 in $0.70 out
126	Meta: Llama 3 8B Instruct meta-llama/llama-3-8b-instruct	Meta	1.2	82.5 t/s	454 ms	$0.045 in $0.145 out
127	Meta: Llama 3.2 1B Instruct meta-llama/llama-3.2-1b-instruct	Meta	1.1	0 t/s	0 ms	$0 in $0 out
128	Ling-3.0-flash (free) inclusionai/ling-3.0-flash:free	Inclusionai	—	— t/s	—	— in — out
129	Elephant openrouter/elephant-alpha	Openrouter	—	— t/s	—	— in — out
130	Meituan: LongCat 2.0 meituan/longcat-2.0	Meituan	—	— t/s	—	— in — out
131	Poolside: Laguna S 2.1 poolside/laguna-s-2.1	Poolside	—	— t/s	—	— in — out
132	Thinking Machines: Inkling thinkingmachines/inkling	Thinkingmachines	—	— t/s	—	— in — out
133	Auto Router (Beta) openrouter/auto-beta	Openrouter	—	— t/s	—	— in — out
134	inclusionAI: Ring-2.6-1T (free) inclusionai/ring-2.6-1t:free	Inclusionai	—	— t/s	—	— in — out
135	Poolside: Laguna S 2.1 (free) poolside/laguna-s-2.1:free	Poolside	—	— t/s	—	— in — out
136	Nex AGI: Nex-N2-Pro (free) nex-agi/nex-n2-pro:free	Nex-agi	—	— t/s	—	— in — out
137	Baidu Qianfan: CoBuddy (free) baidu/cobuddy:free	Baidu	—	— t/s	—	— in — out
138	DeepSeek: DeepSeek V4 Flash (free) deepseek/deepseek-v4-flash:free	DeepSeek	—	— t/s	—	— in — out
139	MoonshotAI: Kimi K3 moonshotai/kimi-k3	Moonshot AI	—	— t/s	—	— in — out
140	OpenAI: GPT-5.6 Luna Pro openai/gpt-5.6-luna-pro	OpenAI	—	— t/s	—	— in — out
141	OpenAI: GPT-5.6 Terra Pro openai/gpt-5.6-terra-pro	OpenAI	—	— t/s	—	— in — out
142	xAI: Grok Latest ~x-ai/grok-latest	~x-ai	—	— t/s	—	— in — out
143	Kwaipilot: KAT-Coder-Air V2.5 kwaipilot/kat-coder-air-v2.5	Kwaipilot	—	— t/s	—	— in — out
144	Kwaipilot: KAT-Coder-Pro V2.5 kwaipilot/kat-coder-pro-v2.5	Kwaipilot	—	— t/s	—	— in — out
145	OpenAI: GPT-5.6 Sol Pro openai/gpt-5.6-sol-pro	OpenAI	—	— t/s	—	— in — out
146	xAI: Grok 4.20 Multi-Agent x-ai/grok-4.20-multi-agent	xAI	—	— t/s	—	— in — out
147	Tencent: Hy3 (free) tencent/hy3:free	Tencent	—	— t/s	—	— in — out
148	Google: Lyria 3 Pro Preview google/lyria-3-pro-preview	Google	—	— t/s	—	— in — out
149	inclusionAI: Ling-2.6-flash (free) inclusionai/ling-2.6-flash:free	Inclusionai	—	— t/s	—	— in — out
150	Owl Alpha openrouter/owl-alpha	Openrouter	—	— t/s	—	— in — out
151	AionLabs: Aion-3.0 aion-labs/aion-3.0	Aion-labs	—	— t/s	—	— in — out
152	AionLabs: Aion-3.0-Mini aion-labs/aion-3.0-mini	Aion-labs	—	— t/s	—	— in — out
153	inclusionAI: Ling-2.6-1T (free) inclusionai/ling-2.6-1t:free	Inclusionai	—	— t/s	—	— in — out
154	Tencent: Hy3 preview (free) tencent/hy3-preview:free	Tencent	—	— t/s	—	— in — out
155	Poolside: Laguna XS 2.1 poolside/laguna-xs-2.1	Poolside	—	— t/s	—	— in — out
156	Poolside: Laguna XS 2.1 (free) poolside/laguna-xs-2.1:free	Poolside	—	— t/s	—	— in — out
157	Google: Nano Banana 2 Lite (Gemini 3.1 Flash Lite Image) google/gemini-3.1-flash-lite-image	Google	—	— t/s	—	— in — out
158	Nex AGI: Nex-N2-Mini nex-agi/nex-n2-mini	Nex-agi	—	— t/s	—	— in — out
159	Sakana: Fugu Ultra sakana/fugu-ultra	Sakana	—	— t/s	—	— in — out
160	Google: Nano Banana 2 (Gemini 3.1 Flash Image) google/gemini-3.1-flash-image	Google	—	— t/s	—	— in — out
161	Google: Gemma 3 12B google/gemma-3-12b-it	Google	—	— t/s	—	— in — out
162	OpenRouter: Fusion openrouter/fusion	Openrouter	—	— t/s	—	— in — out
163	Google: Nano Banana Pro (Gemini 3 Pro Image) google/gemini-3-pro-image	Google	—	— t/s	—	— in — out
164	Z.ai: GLM 5.2 z-ai/glm-5.2	Z-ai	—	— t/s	—	— in — out
165	Baidu: Qianfan-OCR-Fast baidu/qianfan-ocr-fast	Baidu	—	— t/s	—	— in — out
166	Poolside: Laguna XS.2 (free) poolside/laguna-xs.2:free	Poolside	—	— t/s	—	— in — out
167	Poolside: Laguna XS.2 poolside/laguna-xs.2	Poolside	—	— t/s	—	— in — out
168	Cohere: North Mini Code (free) cohere/north-mini-code:free	Cohere	—	— t/s	—	— in — out
169	Baidu: Qianfan-OCR-Fast (free) baidu/qianfan-ocr-fast:free	Baidu	—	— t/s	—	— in — out
170	MoonshotAI: Kimi K2.7 Code moonshotai/kimi-k2.7-code	Moonshot AI	—	— t/s	—	— in — out
171	Anthropic: Claude Fable Latest ~anthropic/claude-fable-latest	~anthropic	—	— t/s	—	— in — out
172	Nex AGI: Nex-N2-Pro nex-agi/nex-n2-pro	Nex-agi	—	— t/s	—	— in — out
173	Arcee AI: Trinity Large Thinking (free) arcee-ai/trinity-large-thinking:free	Arcee-ai	—	— t/s	—	— in — out
174	Qwen: Qwen3.6 Plus (free) qwen/qwen3.6-plus:free	Alibaba / Qwen	—	— t/s	—	— in — out
175	Anthropic: Claude Opus 4.6 (Fast) anthropic/claude-opus-4.6-fast	Anthropic	—	— t/s	—	— in — out
176	Qwen: Qwen3.7 Plus qwen/qwen3.7-plus	Alibaba / Qwen	—	— t/s	—	— in — out
177	MoonshotAI: Kimi K2.6 (free) moonshotai/kimi-k2.6:free	Moonshot AI	—	— t/s	—	— in — out
178	Anthropic: Claude Opus 4.8 (Fast) anthropic/claude-opus-4.8-fast	Anthropic	—	— t/s	—	— in — out
179	NVIDIA: Nemotron 3.5 Content Safety (free) nvidia/nemotron-3.5-content-safety:free	NVIDIA	—	— t/s	—	— in — out
180	Qwen: Qwen3.7 Max qwen/qwen3.7-max	Alibaba / Qwen	—	— t/s	—	— in — out
181	xAI: Grok Build 0.1 x-ai/grok-build-0.1	xAI	—	— t/s	—	— in — out
182	NVIDIA: Nemotron 3 Ultra (free) nvidia/nemotron-3-ultra-550b-a55b:free	NVIDIA	—	— t/s	—	— in — out
183	Anthropic: Claude Opus 4.7 (Fast) anthropic/claude-opus-4.7-fast	Anthropic	—	— t/s	—	— in — out
184	OpenAI GPT Latest ~openai/gpt-latest	~openai	—	— t/s	—	— in — out
185	Google Gemini Flash Latest ~google/gemini-flash-latest	~google	—	— t/s	—	— in — out
186	Anthropic Claude Haiku Latest ~anthropic/claude-haiku-latest	~anthropic	—	— t/s	—	— in — out
187	Google: Gemini 3.1 Flash Lite google/gemini-3.1-flash-lite	Google	—	— t/s	—	— in — out
188	MiniMax: MiniMax M2.5 (free) minimax/minimax-m2.5:free	MiniMax	—	— t/s	—	— in — out
189	Arcee AI: Trinity Large Preview arcee-ai/trinity-large-preview	Arcee-ai	—	— t/s	—	— in — out
190	Anthropic Claude Sonnet Latest ~anthropic/claude-sonnet-latest	~anthropic	—	— t/s	—	— in — out
191	OpenAI: GPT Chat Latest openai/gpt-chat-latest	OpenAI	—	— t/s	—	— in — out
192	OpenAI GPT Mini Latest ~openai/gpt-mini-latest	~openai	—	— t/s	—	— in — out
193	AllenAI: Olmo 3.1 32B Instruct allenai/olmo-3.1-32b-instruct	Allenai	—	— t/s	—	— in — out
194	Perceptron: Perceptron Mk1 perceptron/perceptron-mk1	Perceptron	—	— t/s	—	— in — out
195	Google Gemini Pro Latest ~google/gemini-pro-latest	~google	—	— t/s	—	— in — out
196	MoonshotAI Kimi Latest ~moonshotai/kimi-latest	~moonshotai	—	— t/s	—	— in — out
197	Qwen: Qwen3.6 Max Preview qwen/qwen3.6-max-preview	Alibaba / Qwen	—	— t/s	—	— in — out
198	Qwen: Qwen3.6 27B qwen/qwen3.6-27b	Alibaba / Qwen	—	— t/s	—	— in — out
199	IBM: Granite 4.1 8B ibm-granite/granite-4.1-8b	Ibm-granite	—	— t/s	—	— in — out
200	NVIDIA: Nemotron 3 Nano Omni (free) nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free	NVIDIA	—	— t/s	—	— in — out

🧮 Tool

Pricing Calculator

Estimate your monthly API costs: enter your expected token volume and we calculate the cost for the top 15 models.

Input tokens (M/month) Output tokens (M/month)

#	Model	Vendor	Quality	Cost/month (USD)
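The arithmetic behind such a calculator is straightforward because prices are quoted per 1 million tokens: monthly cost = input volume (M) × input price + output volume (M) × output price. The sketch below illustrates this under stated assumptions. The function names are hypothetical, and it uses three models whose prices are copied from the leaderboard table rather than the full top 15.

```python
def monthly_cost_usd(input_mtok: float, output_mtok: float,
                     price_in: float, price_out: float) -> float:
    """Monthly cost for volumes in millions of tokens and prices in USD per 1M tokens."""
    return input_mtok * price_in + output_mtok * price_out


# (input price, output price) in USD per 1M tokens, copied from the leaderboard table.
PRICES = {
    "anthropic/claude-opus-4.8": (5.00, 25.00),
    "x-ai/grok-4.5": (2.00, 6.00),
    "deepseek/deepseek-v4-pro": (0.435, 0.87),
}


def cost_table(input_mtok: float, output_mtok: float) -> list[tuple[str, float]]:
    """Return (slug, monthly cost) pairs, cheapest first."""
    rows = [
        (slug, monthly_cost_usd(input_mtok, output_mtok, p_in, p_out))
        for slug, (p_in, p_out) in PRICES.items()
    ]
    return sorted(rows, key=lambda row: row[1])


# Example workload: 100M input tokens and 20M output tokens per month.
for slug, cost in cost_table(100, 20):
    print(f"{slug:30s} ${cost:,.2f}")
```

For this workload, Claude Opus 4.8 comes to 100 × 5 + 20 × 25 = 1,000 USD per month, which shows how strongly the output price weighs even at a 5:1 input-to-output ratio.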

📈 Tool

Quality vs. Price (Pareto Chart)

Where is the best tradeoff? Models toward the upper left are the Pareto optima: high quality, low price.

Legend: Pareto optimum · Other models
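A model is Pareto-optimal when no other model is at least as good on both axes and strictly better on one. The sketch below shows one common way to compute such a front. It is an illustration, not the chart's actual implementation: it assumes the price axis is the output price per 1M tokens, and the `Point` type and function name are hypothetical. Quality and price values are copied from the leaderboard table.

```python
from typing import NamedTuple


class Point(NamedTuple):
    slug: str
    quality: float  # higher is better
    price: float    # USD per 1M output tokens (assumed axis), lower is better


def pareto_front(points: list[Point]) -> list[Point]:
    """Return the non-dominated points, ordered by ascending price."""
    front = []
    best_quality = float("-inf")
    # Walk from cheapest to most expensive; at equal price, highest quality first.
    for p in sorted(points, key=lambda p: (p.price, -p.quality)):
        # A point survives only if it beats every cheaper (or equally priced) model.
        # Exact duplicates on both axes are reported once.
        if p.quality > best_quality:
            front.append(p)
            best_quality = p.quality
    return front


models = [
    Point("anthropic/claude-fable-5", 59.9, 50.00),
    Point("openai/gpt-5.6-sol", 57.7, 30.00),
    Point("anthropic/claude-opus-4.8", 55.7, 25.00),
    Point("x-ai/grok-4.5", 53.8, 6.00),
    Point("openai/gpt-5.5", 50.4, 30.00),
    Point("deepseek/deepseek-v4-pro", 43.1, 0.87),
    Point("anthropic/claude-sonnet-5", 41.7, 10.00),
]

for p in pareto_front(models):
    print(f"{p.slug:28s} quality={p.quality} price=${p.price:.2f}")
```

In this sample, GPT-5.5 and Claude Sonnet 5 drop out: each is matched or undercut on price by a model with a higher Quality Index. Sorting once and sweeping keeps the computation at O(n log n) instead of comparing every pair.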

❓ Frequently Asked Questions About the AI Model Leaderboard

Where does the leaderboard data come from?

Quality Index, speed (output tokens/s), latency (time to first token), and prices (USD per 1 million tokens) come directly from Artificial Analysis. Synchronization runs daily at 04:00 UTC. We store no benchmark values of our own and perform no evaluation of our own.

What exactly does the Quality Index measure?

The Artificial Analysis Intelligence Index is a composite score built from more than ten independent benchmarks: MMLU-Pro (knowledge), GPQA and HLE (reasoning), LiveCodeBench and SciCode (coding), AIME (mathematics), IFBench (instruction following), LCR (long-context recall), and τ² (tool use). The scale runs from 0 to 100; higher is better.

Why are some well-known models missing from the leaderboard?

We only show models that are marked is_active=true in the registry and for which Artificial Analysis provides complete benchmark data. Pure image or audio models, deprecated versions, and closed-beta models without a public API do not appear.
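The inclusion rule can be pictured as a simple predicate over registry entries. This is only a sketch: apart from `is_active`, the field names are assumptions, "complete benchmark data" is interpreted here as quality, speed, and latency all being present, and the slugs are made-up placeholders.

```python
from typing import Optional, TypedDict


class RegistryEntry(TypedDict):
    slug: str
    is_active: bool
    quality: Optional[float]
    speed_tps: Optional[float]
    latency_s: Optional[float]


def is_listed(entry: RegistryEntry) -> bool:
    """True if the entry is active and none of its benchmark fields is missing."""
    metrics = (entry["quality"], entry["speed_tps"], entry["latency_s"])
    return entry["is_active"] and all(m is not None for m in metrics)


# Hypothetical registry contents for illustration only.
registry: list[RegistryEntry] = [
    {"slug": "example/active-complete", "is_active": True,
     "quality": 40.0, "speed_tps": 100.0, "latency_s": 1.2},
    {"slug": "example/deprecated", "is_active": False,
     "quality": 35.0, "speed_tps": 90.0, "latency_s": 0.8},
    {"slug": "example/no-benchmarks", "is_active": True,
     "quality": None, "speed_tps": None, "latency_s": None},
]

listed = [entry["slug"] for entry in registry if is_listed(entry)]
print(listed)  # ['example/active-complete']
```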

How do I use sorting and filters effectively?

Sort by Quality for maximum accuracy, by Speed for high throughput, by Latency for responsive streaming UIs, and by price for cost-sensitive workloads. Combine the vendor and price filters to find, for example, "only OpenAI under 5 USD per 1M output tokens".
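The same sort-and-filter logic can be reproduced offline on exported data. The following sketch assumes a small in-memory list with values copied from the leaderboard table. The `Model` class and `select` function are hypothetical names, not part of the site.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    slug: str
    vendor: str
    quality: float
    speed_tps: float
    latency_s: float
    price_out: float  # USD per 1M output tokens


MODELS = [
    Model("openai/gpt-5.6-sol", "OpenAI", 57.7, 60.5, 31.00, 30.00),
    Model("openai/gpt-5.6-luna", "OpenAI", 46.1, 176.9, 5.77, 6.00),
    Model("openai/gpt-5.4-mini", "OpenAI", 29.8, 187.2, 13.36, 4.50),
    Model("openai/gpt-5.4-nano", "OpenAI", 30.2, 127.9, 5.10, 1.25),
    Model("x-ai/grok-4.5", "xAI", 53.8, 66.2, 10.89, 6.00),
    Model("deepseek/deepseek-v4-pro", "DeepSeek", 43.1, 70.4, 0.97, 0.87),
]


def select(models: list[Model], vendor: Optional[str] = None,
           max_price_out: Optional[float] = None,
           sort_by: str = "quality") -> list[Model]:
    """Filter by vendor and output price, then sort by the chosen column."""
    # Quality and speed: higher is better. Latency and price: lower is better.
    descending = sort_by in ("quality", "speed_tps")
    hits = [
        m for m in models
        if (vendor is None or m.vendor == vendor)
        and (max_price_out is None or m.price_out < max_price_out)
    ]
    return sorted(hits, key=lambda m: getattr(m, sort_by), reverse=descending)


# "Only OpenAI under 5 USD per 1M output tokens", best quality first.
for m in select(MODELS, vendor="OpenAI", max_price_out=5.0):
    print(m.slug, m.quality, m.price_out)
```

Switching `sort_by` to `"latency_s"` puts the most responsive model first, which is the ordering a streaming UI would care about.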

How do I calculate concrete token costs?

In the token cost calculator, you can enter input and output volumes and try out the monthly costs for each model live. For a direct head-to-head comparison of two models, use the comparison pages.