Available models
| Model | Provider | Parameter | LMArena Score | Latency per 10,000 tokens |
|---|---|---|---|---|
| Claude Opus 4.6 | Anthropic | claude-opus-4-6 | 1498 | 7.4s |
| Claude Opus 4.7 | Anthropic | claude-opus-4-7 | 1491 | TBD |
| Gemini 3.5 Flash | Google | gemini-3.5-flash | 1480 | TBD |
| GPT-5.5 | OpenAI | gpt-5.5 | 1475 | TBD |
| Gemini 3 Flash Preview | Google | gemini-3-flash-preview | 1474 | 4.2s |
| Claude Opus 4.5 | Anthropic | claude-opus-4-5-20251101 | 1468 | 3.9s |
| Claude Sonnet 4.6 | Anthropic | claude-sonnet-4-6 | 1466 | 7.2s |
| Claude 4.5 Sonnet | Anthropic | claude-sonnet-4-5-20250929 | 1453 | 5.6s |
| Gemini 2.5 Pro | Google | gemini-2.5-pro | 1448 | 4.0s |
| GPT-5.1 | OpenAI | gpt-5.1 | 1439 | 2.7s |
| Gemini 3.1 Flash Lite Preview | Google | gemini-3.1-flash-lite-preview | 1438 | TBD |
| GPT-5.2 | OpenAI | gpt-5.2 | 1437 | 1.6s |
| GPT-5 | OpenAI | gpt-5 | 1434 | 4.3s |
| Kimi K2.5 | Moonshot AI | kimi-k2.5 | 1432 | 1.2s |
| GPT-4.1 | OpenAI | gpt-4.1 | 1413 | 1.8s |
| Claude 4 Opus | Anthropic | claude-opus-4-20250514 | 1412 | 13.6s |
| Gemini 2.5 Flash | Google | gemini-2.5-flash | 1411 | 2.6s |
| Claude 4.5 Haiku | Anthropic | claude-haiku-4-5-20251001 | 1409 | 4.1s |
| Qwen3 Next 80B A3B | Alibaba Cloud | qwen3-next-80b-a3b | 1402 | 3.1s |
| GPT-5 mini | OpenAI | gpt-5-mini | 1390 | 3.8s |
| Claude 4 Sonnet | Anthropic | claude-sonnet-4-20250514 | 1389 | 5.1s |
| Gemini 2.5 Flash-Lite | Google | gemini-2.5-flash-lite | 1380 | 1.1s |
| gpt-oss-120b | OpenAI | gpt-oss-120b | 1353 | 1.4s |
| Qwen3 32B | Alibaba Cloud | qwen3-32B | 1347 | 3.7s |
| GPT-5 nano | OpenAI | gpt-5-nano | 1337 | 3.2s |
| gpt-oss-20b | OpenAI | gpt-oss-20b | 1317 | 1.1s |
By latency (per 10,000 tokens)
| Model | Provider | Parameter | Latency per 10,000 tokens | LMArena Score |
|---|---|---|---|---|
| Gemini 2.5 Flash-Lite | Google | gemini-2.5-flash-lite | 1.1s | 1380 |
| gpt-oss-20b | OpenAI | gpt-oss-20b | 1.1s | 1317 |
| Kimi K2.5 | Moonshot AI | kimi-k2.5 | 1.2s | 1432 |
| gpt-oss-120b | OpenAI | gpt-oss-120b | 1.4s | 1353 |
| GPT-5.2 | OpenAI | gpt-5.2 | 1.6s | 1437 |
| GPT-4.1 | OpenAI | gpt-4.1 | 1.8s | 1413 |
| Gemini 2.5 Flash | Google | gemini-2.5-flash | 2.6s | 1411 |
| GPT-5.1 | OpenAI | gpt-5.1 | 2.7s | 1439 |
| Qwen3 Next 80B A3B | Alibaba Cloud | qwen3-next-80b-a3b | 3.1s | 1402 |
| GPT-5 nano | OpenAI | gpt-5-nano | 3.2s | 1337 |
| Qwen3 32B | Alibaba Cloud | qwen3-32B | 3.7s | 1347 |
| GPT-5 mini | OpenAI | gpt-5-mini | 3.8s | 1390 |
| Claude Opus 4.5 | Anthropic | claude-opus-4-5-20251101 | 3.9s | 1468 |
| Gemini 2.5 Pro | Google | gemini-2.5-pro | 4.0s | 1448 |
| Claude 4.5 Haiku | Anthropic | claude-haiku-4-5-20251001 | 4.1s | 1409 |
| Gemini 3 Flash Preview | Google | gemini-3-flash-preview | 4.2s | 1474 |
| GPT-5 | OpenAI | gpt-5 | 4.3s | 1434 |
| Claude 4.5 Sonnet | Anthropic | claude-sonnet-4-5-20250929 | 5.6s | 1453 |
| Claude Sonnet 4.6 | Anthropic | claude-sonnet-4-6 | 7.2s | 1466 |
| Claude Opus 4.6 | Anthropic | claude-opus-4-6 | 7.4s | 1498 |
| Claude 4 Opus | Anthropic | claude-opus-4-20250514 | 13.6s | 1412 |
| Claude Opus 4.7 | Anthropic | claude-opus-4-7 | TBD | 1491 |
| GPT-5.5 | OpenAI | gpt-5.5 | TBD | 1475 |
| Gemini 3.1 Flash Lite Preview | Google | gemini-3.1-flash-lite-preview | TBD | 1438 |
| Gemini 3.5 Flash | Google | gemini-3.5-flash | TBD | 1480 |
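The two tables above trade off quality against speed, so a common question is "which model is fastest among those that meet a given LMArena score?" The following is a minimal sketch of that lookup, not part of the LLM Gateway itself. The `MODELS` list is a hand-copied subset of the rows above, and the helper name `fastest_model` is hypothetical. Models whose latency is listed as TBD are stored as `None` and skipped, since they cannot be ranked yet.

```python
from typing import Optional

# Hand-copied subset of the table above (illustrative, not exhaustive):
# (model parameter, LMArena score, latency in seconds per 10,000 tokens).
# A latency of None stands for "TBD" in the table.
MODELS = [
    ("claude-opus-4-6", 1498, 7.4),
    ("claude-opus-4-7", 1491, None),
    ("gemini-3-flash-preview", 1474, 4.2),
    ("claude-opus-4-5-20251101", 1468, 3.9),
    ("gpt-5.1", 1439, 2.7),
    ("gpt-5.2", 1437, 1.6),
    ("kimi-k2.5", 1432, 1.2),
    ("gemini-2.5-flash-lite", 1380, 1.1),
]


def fastest_model(min_score: int) -> Optional[str]:
    """Return the parameter of the lowest-latency model scoring at least
    min_score, or None if no model with a known latency qualifies."""
    candidates = [
        # Sort key: latency first, then higher score as a tie-breaker.
        (latency, -score, name)
        for name, score, latency in MODELS
        if latency is not None and score >= min_score
    ]
    if not candidates:
        return None
    return min(candidates)[2]


if __name__ == "__main__":
    for threshold in (1430, 1460, 1490):
        print(threshold, fastest_model(threshold))
```

The returned string is the value from the Parameter column, so it can be passed wherever a model identifier is expected.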
By provider
Anthropic Claude
| Model | Parameter | LMArena Score | Latency per 10,000 tokens |
|---|---|---|---|
| Claude Opus 4.7 | claude-opus-4-7 | 1491 | TBD |
| Claude Opus 4.6 | claude-opus-4-6 | 1498 | 7.4s |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | 1466 | 7.2s |
| Claude Opus 4.5 | claude-opus-4-5-20251101 | 1468 | 3.9s |
| Claude 4.5 Sonnet | claude-sonnet-4-5-20250929 | 1453 | 5.6s |
| Claude 4.5 Haiku | claude-haiku-4-5-20251001 | 1409 | 4.1s |
| Claude 4 Opus | claude-opus-4-20250514 | 1412 | 13.6s |
OpenAI GPT
| Model | Parameter | LMArena Score | Latency per 10,000 tokens |
|---|---|---|---|
| GPT-5.5 | gpt-5.5 | 1475 | TBD |
| GPT-5.2 | gpt-5.2 | 1437 | 1.6s |
| GPT-5.1 | gpt-5.1 | 1439 | 2.7s |
| GPT-5 | gpt-5 | 1434 | 4.3s |
| GPT-5 nano | gpt-5-nano | 1337 | 3.2s |
| GPT-5 mini | gpt-5-mini | 1390 | 3.8s |
| GPT-4.1 | gpt-4.1 | 1413 | 1.8s |
| gpt-oss-120b | gpt-oss-120b | 1353 | 1.4s |
| gpt-oss-20b | gpt-oss-20b | 1317 | 1.1s |
Google Gemini
| Model | Parameter | LMArena Score | Latency per 10,000 tokens |
|---|
| Gemini 3.5 Flash | gemini-3.5-flash | 1480 | TBD |
| Gemini 3 Flash Preview | gemini-3-flash-preview | 1474 | 4.2s |
| Gemini 3.1 Flash Lite Preview | gemini-3.1-flash-lite-preview | 1438 | TBD |
| Gemini 2.5 Pro | gemini-2.5-pro | 1448 | 4.0s |
| Gemini 2.5 Flash | gemini-2.5-flash | 1411 | 2.6s |
| Gemini 2.5 Flash-Lite | gemini-2.5-flash-lite | 1380 | 1.1s |
Gemini 3.1 Flash Lite Preview is currently available in the US region only.
Alibaba Cloud Qwen
| Model | Parameter | LMArena Score | Latency per 10,000 tokens |
|---|---|---|---|
| Qwen3 Next 80B A3B | qwen3-next-80b-a3b | 1402 | 3.1s |
| Qwen3 32B | qwen3-32B | 1347 | 3.7s |
Moonshot AI Kimi
| Model | Parameter | LMArena Score | Latency per 10,000 tokens |
|---|---|---|---|
| Kimi K2.5 | kimi-k2.5 | 1432 | 1.2s |
Claude Opus 4.5 and Claude Opus 4.6 currently support only context windows
under 200k tokens when accessed via the LLM Gateway.
Head to our Playground to test out LLM Gateway without writing any code!