Pricing

Per-model API rates synced with our catalog—reclaimed silicon, vLLM throughput, and OpenAI-compatible routes. Subscription plans and top-ups bill in platform credits below.

Generative AI

ModelProviderInput / 1M tokOutput / 1M tokPages / 1KEndpoint
Qwen3.5 122B A10BFLAGSHIP
Alibaba Cloud$0.90/1M$3.50/1M/v1/chat/completions
Qwen3.6 27B
Alibaba Cloud$0.35/1M$1.25/1M/v1/chat/completions
Qwen3.6 35B A3B
Alibaba Cloud$0.25/1M$0.95/1M/v1/chat/completions
Gemma4 31B
Google DeepMind$0.45/1M$1.50/1M/v1/chat/completions
Qwen3.5 9B
Alibaba Cloud$0.10/1M$0.35/1M/v1/chat/completions
Nemotron 3 Nano Omni 30B A3B
NVIDIA$0.22/1M$0.85/1M/v1/chat/completions
Gemma4 26B A4B
Google DeepMind$0.20/1M$0.75/1M/v1/chat/completions

OCR

ModelProviderInput / 1M tokOutput / 1M tokPages / 1KEndpoint
Chandra OCRFEATURED
Cobble Labs$1.25/v1/ocr
GLM-OCR
Zhipu AI$1.50/v1/ocr
DeepSeek OCR2
DeepSeek$1.80/v1/ocr
Nemotron OCR v2
NVIDIA$2.00/v1/ocr

Embeddings

ModelProviderInput / 1M tokOutput / 1M tokPages / 1KEndpoint
Nomic Embed 1.5FEATURED
Nomic AI$0.025/1M/v1/embeddings
Granite Embedding 311M
IBM$0.030/1M/v1/embeddings
Granite Embedding 97M
IBM$0.015/1M/v1/embeddings
Qwen3 Embedding 0.6B
Alibaba Cloud$0.020/1M/v1/embeddings
Qwen3 Embedding 8B
Alibaba Cloud$0.06/1M/v1/embeddings

Subscriptions & credits

Plans include monthly credits and rate limits. Top-ups never expire.

Loading plans…

Credit top-up

Purchase additional credits anytime. Applied immediately; they don't expire.

FAQ

How do monthly credits work?

Each plan includes a monthly credit allowance that renews at your billing cycle end. Credits are added when your subscription renews and expire at the end of that billing period.

What happens to unused credits?

Unused monthly credits expire at the end of your billing cycle and do not carry over. One-time top-up credits never expire.

How is inference cost calculated?

Per-model token, page, and endpoint rates are listed in the tables above. Failed requests are not charged where metering supports it.

Can I change plans?

Yes. Upgrade or downgrade through billing settings in your account. Changes typically take effect on the next cycle per Stripe rules.

What is the concurrent request limit?

It caps how many API requests can run in parallel per key. Higher tiers allow more throughput for bursty workloads.

Questions? Documentation · Platform