Sustainable AI Inference

Cobble

Cobble is NOT a datacenter

AI inference that doesn't cost the earth. We run large models on recycled hardware, powered entirely by renewable energy.

Curated catalog · Quantized & benchmarked
Generative AI

Qwen3.5 122B A10B

FLAGSHIP
Generative AI

Qwen3.6 27B

Generative AI

Qwen3.6 35B A3B

Generative AI

Gemma4 31B

Generative AI

Qwen3.5 9B

Generative AI

Nemotron 3 Nano Omni 30B A3B

Generative AI

Gemma4 26B A4B

OCR

Chandra OCR

FEATURED
OCR

GLM-OCR

OCR

DeepSeek OCR2

OCR

Nemotron OCR v2

Embeddings

Nomic Embed 1.5

FEATURED
Embeddings

Granite Embedding 311M

Embeddings

Granite Embedding 97M

Embeddings

Qwen3 Embedding 0.6B

Embeddings

Qwen3 Embedding 8B

Generative AI

Qwen3.5 122B A10B

FLAGSHIP
Generative AI

Qwen3.6 27B

Generative AI

Qwen3.6 35B A3B

Generative AI

Gemma4 31B

Generative AI

Qwen3.5 9B

Generative AI

Nemotron 3 Nano Omni 30B A3B

Generative AI

Gemma4 26B A4B

OCR

Chandra OCR

FEATURED
OCR

GLM-OCR

OCR

DeepSeek OCR2

OCR

Nemotron OCR v2

Embeddings

Nomic Embed 1.5

FEATURED
Embeddings

Granite Embedding 311M

Embeddings

Granite Embedding 97M

Embeddings

Qwen3 Embedding 0.6B

Embeddings

Qwen3 Embedding 8B

The Cobble difference

Not your typical inference provider

Traditional AI providers burn megawatts in massive datacenters. We built something different.

vs
Traditional
Infrastructure
Massive datacenters
Distributed edge nodes
Power Source
Grid-dependent megawatts
Solar + wind energy
Cooling
Water cooling systems
Passive air cooling
Hardware
Proprietary enterprise GPUs
Recycled consumer GPUs
Carbon Footprint
High emissions
Carbon negative
Model Focus
Full precision only
Optimized quantized models
Receipts, not promises

Built to do better

Every component was sourced, recycled, and repurposed.

0
Water Usage
0%
Green Energy
0%
Recycled Hardware
-0x
Carbon Footprint
Open numbers · Open weights · Open methodology

Quantized. Benchmarked. Real.

We publish what others hide. Every model is tested, quantized, and documented.

Qwen3.5 122B A10B

Q4_K_M
Quality retention94%
Memory
28 GB
Speed
12 tok/s

Qwen3.6 35B A3B

Q5_K_M
Quality retention96%
Memory
18 GB
Speed
18 tok/s

Gemma4 31B

Q4_K_M
Quality retention93%
Memory
16 GB
Speed
20 tok/s

Nemotron 3 Nano Omni

Q5_K_M
Quality retention95%
Memory
17 GB
Speed
16 tok/s
Three steps to inference

How it works

01

Choose a model

Pick from our curated selection of quantized models optimized for speed and quality on recycled hardware.

02

Send a request

Use our OpenAI-compatible API. Drop-in replacement for your existing inference pipeline.

03

Get results

Low-latency responses powered by distributed edge nodes running on green energy.

Ready to route smarter?

Join the inference network that gives back to the planet.