Generative AI
Curated catalog · Quantized & benchmarked
Models Available From Cobble
Generative AI
Qwen3.6 27B
Generative AI
Qwen3.6 35B A3B
Generative AI
Gemma4 31B
Generative AI
Qwen3.5 9B
Generative AI
Nemotron 3 Nano Omni 30B A3B
Generative AI
Gemma4 26B A4B
OCR
Chandra OCR
FEATUREDOCR
GLM-OCR
OCR
DeepSeek OCR2
OCR
Nemotron OCR v2
Embeddings
Nomic Embed 1.5
FEATUREDEmbeddings
Granite Embedding 311M
Embeddings
Granite Embedding 97M
Embeddings
Qwen3 Embedding 0.6B
Embeddings
Qwen3 Embedding 8B
Generative AI
Qwen3.5 122B A10B
FLAGSHIPGenerative AI
Qwen3.6 27B
Generative AI
Qwen3.6 35B A3B
Generative AI
Gemma4 31B
Generative AI
Qwen3.5 9B
Generative AI
Nemotron 3 Nano Omni 30B A3B
Generative AI
Gemma4 26B A4B
OCR
Chandra OCR
FEATUREDOCR
GLM-OCR
OCR
DeepSeek OCR2
OCR
Nemotron OCR v2
Embeddings
Nomic Embed 1.5
FEATUREDEmbeddings
Granite Embedding 311M
Embeddings
Granite Embedding 97M
Embeddings
Qwen3 Embedding 0.6B
Embeddings
Qwen3 Embedding 8B
The Cobble difference
Not your typical inference provider
Traditional AI providers burn megawatts in massive datacenters. We built something different.
vs
Traditional
Cobble
Infrastructure
Massive datacenters
Distributed edge nodes
Power Source
Grid-dependent megawatts
Solar + wind energy
Cooling
Water cooling systems
Passive air cooling
Hardware
Proprietary enterprise GPUs
Recycled consumer GPUs
Carbon Footprint
High emissions
Carbon negative
Model Focus
Full precision only
Optimized quantized models
Receipts, not promises
Built to do better
Every component was sourced, recycled, and repurposed.
0
Water Usage
0%
Green Energy
0%
Recycled Hardware
-0x
Carbon Footprint
Open numbers · Open weights · Open methodology
Quantized. Benchmarked. Real.
We publish what others hide. Every model is tested, quantized, and documented.
Qwen3.5 122B A10B
Q4_K_MQuality retention94%
Memory
28 GB
Speed
12 tok/s
Qwen3.6 35B A3B
Q5_K_MQuality retention96%
Memory
18 GB
Speed
18 tok/s
Gemma4 31B
Q4_K_MQuality retention93%
Memory
16 GB
Speed
20 tok/s
Nemotron 3 Nano Omni
Q5_K_MQuality retention95%
Memory
17 GB
Speed
16 tok/s
Three steps to inference
How it works
01
Choose a model
Pick from our curated selection of quantized models optimized for speed and quality on recycled hardware.
02
Send a request
Use our OpenAI-compatible API. Drop-in replacement for your existing inference pipeline.
03
Get results
Low-latency responses powered by distributed edge nodes running on green energy.



















