Quantized API
Quantized provides a single, unified API to access multiple AI providers and tools. One API key, one credit balance, one consistent interface — regardless of which provider powers the request.
What you can do
- Chat Completions (`POST /v1/chat/completions`) — Generate text with any LLM. Compatible with the OpenAI Chat Completions format; supports text, vision, audio, tool calling, streaming, and more.
- Responses (`POST /v1/responses`) — Stateful text generation using the Responses API format; supports instructions, tools, reasoning, and streaming.
- Web search (`POST /v1/web-search`) — Search the web and get structured results.
- Fetch (`POST /v1/fetch`) — Extract clean text content from any URL.
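All of these endpoints share one base URL and the same bearer-token authentication. A minimal sketch of composing such a request with Python's standard library; the `"query"` field name for web search is an assumption for illustration, so check the API Reference for the exact schema:

```python
import json
import urllib.request

BASE_URL = "https://api.quantized.us/v1"
API_KEY = "sk-quantized-YOUR-KEY"  # placeholder key

def build_request(path: str, body: dict) -> urllib.request.Request:
    """Compose an authenticated POST for any Quantized endpoint."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# The "query" field is hypothetical; see the API Reference for the real schema.
req = build_request("/web-search", {"query": "example search"})
print(req.full_url)  # https://api.quantized.us/v1/web-search
# To send: urllib.request.urlopen(req) returns the JSON response.
```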
Supported providers
| Provider | Capabilities |
|---|---|
| OpenRouter | 300+ LLMs (GPT-4, Claude, Llama, Gemini, …), Responses API |
| Anthropic | Claude models directly |
| Exa | Web search, content extraction |
| Tavily | Web search, content extraction |
You don’t need to manage separate API keys or accounts for each provider. Quantized handles provider routing, authentication, and billing automatically.
Get started
Get your API key
Your institution provides you with a Quantized API key (format: `sk-quantized-...`) or a JWT token.
Make your first request
```shell
curl -X POST https://api.quantized.us/v1/chat/completions \
  -H "Authorization: Bearer sk-quantized-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4.1-mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
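Since the API is compatible with the OpenAI Chat Completions format, adding `"stream": true` to the request body above switches the response to Server-Sent Events. A sketch of pulling tokens out of such a stream; the sample `data:` lines are illustrative, not captured output:

```python
import json

# Illustrative SSE lines, in the OpenAI-compatible streaming chunk shape.
sample_stream = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]

tokens = []
for line in sample_stream:
    payload = line[len("data: "):]
    if payload == "[DONE]":  # sentinel that ends the stream
        break
    chunk = json.loads(payload)
    tokens.append(chunk["choices"][0]["delta"].get("content", ""))

print("".join(tokens))  # Hello!
```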
Explore the docs
Read the Quickstart guide or jump to the API Reference.
Key features
- OpenAI-compatible — Use the OpenAI SDK with `base_url="https://api.quantized.us/v1"`
- Unified billing — One credit balance across all providers and tools
- Provider routing — Force a specific provider with the `X-Quantized-Provider` header, or let Quantized pick the default
- Streaming — Server-Sent Events for real-time token delivery
- Transparent pricing — Every response includes `credits_used` and `credits_remaining`
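Routing and pricing combine per request: pin a provider with the header, then read the credit fields off the response. A sketch with an illustrative (not real) response body; only the header name and the two credit fields come from the feature list above:

```python
import json

# Request headers: the X-Quantized-Provider value pins a provider;
# omit the header to let Quantized pick the default.
headers = {
    "Authorization": "Bearer sk-quantized-YOUR-KEY",
    "Content-Type": "application/json",
    "X-Quantized-Provider": "anthropic",
}

# Illustrative response body showing the pricing fields.
raw = '{"choices": [], "credits_used": 12, "credits_remaining": 988}'
body = json.loads(raw)
print(body["credits_used"], body["credits_remaining"])  # 12 988
```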