Sustainable AI Inference

Cobble

Cobble is NOT a datacenter

AI inference that doesn't cost the earth. Built from reclaimed GPUs and server hardware — extending useful life instead of manufacturing new silicon. Deployed in regions with access to renewable energy.

Curated catalog · Quantized & benchmarked

Models Available From Cobble

Generative AI

Qwen3.5 122B A10B

FLAGSHIP

Generative AI

Qwen3.6 27B

Generative AI

Qwen3.6 35B A3B

Generative AI

Gemma4 31B

Generative AI

Qwen3.5 9B

Generative AI

Gemma4 26B A4B

Generative AI

Gemma4 12B

Generative AI

Gemma4 E4B

Generative AI

Gemma4 E2B

Generative AI

Ministral 8B

Generative AI

Ministral 3B

Generative AI

Mistral Nemo 12B

Generative AI

GPT-OSS 20B

Generative AI

DeepSeek V4 Flash

COMING SOON

Generative AI

Ornith 9B

COMING SOON

Generative AI

Ornith 1 35B

COMING SOON

Generative AI

Ornith 1 31B

COMING SOON

OCR

Chandra OCR

FEATURED

OCR

GLM-OCR

OCR

DeepSeek OCR2

OCR

Nemotron OCR v2

Embeddings

Nomic Embed 1.5

FEATURED

Embeddings

Granite Embedding 311M

Embeddings

Granite Embedding 97M

Embeddings

Qwen3 Embedding 0.6B

Embeddings

Qwen3 Embedding 8B

Generative AI

Qwen3.5 122B A10B

FLAGSHIP

Generative AI

Qwen3.6 27B

Generative AI

Qwen3.6 35B A3B

Generative AI

Gemma4 31B

Generative AI

Qwen3.5 9B

Generative AI

Gemma4 26B A4B

Generative AI

Gemma4 12B

Generative AI

Gemma4 E4B

Generative AI

Gemma4 E2B

Generative AI

Ministral 8B

Generative AI

Ministral 3B

Generative AI

Mistral Nemo 12B

Generative AI

GPT-OSS 20B

Generative AI

DeepSeek V4 Flash

COMING SOON

Generative AI

Ornith 9B

COMING SOON

Generative AI

Ornith 1 35B

COMING SOON

Generative AI

Ornith 1 31B

COMING SOON

OCR

Chandra OCR

FEATURED

OCR

GLM-OCR

OCR

DeepSeek OCR2

OCR

Nemotron OCR v2

Embeddings

Nomic Embed 1.5

FEATURED

Embeddings

Granite Embedding 311M

Embeddings

Granite Embedding 97M

Embeddings

Qwen3 Embedding 0.6B

Embeddings

Qwen3 Embedding 8B

The Cobble difference

Not your typical inference provider

Traditional AI providers burn megawatts in massive datacenters. We built something different.

Traditional

Cobble

Infrastructure

Massive datacenters

Distributed edge nodes

Power Source

Grid-dependent megawatts

Renewable-ready regions

Cooling

Evaporative water cooling

No evaporative cooling

Hardware

Proprietary enterprise GPUs

Reclaimed GPUs & servers

Carbon Footprint

High manufacturing churn

No new-silicon manufacturing

Model Focus

Full precision only

Per-model quantization

Receipts, not promises

Built to do better

Every component was sourced, recycled, and repurposed.

Water Usage

Green Energy

Recycled Hardware

-0x

Carbon Footprint

Open numbers · Open weights · Open methodology

Quantized. Benchmarked. Real.

We publish what others hide. Every model is tested, quantized, and documented.

Qwen3.5 122B A10B

FP8

Context

256K tokens

Throughput

65 tokens/sec

Chandra OCR

FP8

Context

Up to 500 pages per batch

Throughput

See catalog

Nomic Embed 1.5

FP8

Context

8K tokens

Throughput

See catalog

Three steps to inference

How it works

Choose a model

Pick from our curated selection of quantized models optimized for speed and quality on recycled hardware.

Send a request

Use our OpenAI-compatible API. Drop-in replacement for your existing inference pipeline.

Get results

OpenAI-compatible responses from distributed edge nodes running vLLM.

Ready to route smarter?

Join the inference network that gives back to the planet.