Quantized API

Quantized provides a single, unified API to access multiple AI providers and tools. One API key, one credit balance, one consistent interface — regardless of which provider powers the request.

What you can do

Chat Completions

Generate text with any supported LLM. Compatible with the OpenAI Chat Completions format.
Supports text, vision, tool calling, structured output, reasoning, and streaming.

POST /v1/chat/completions
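
Because the endpoint follows the OpenAI Chat Completions format, a tool-calling request body can be sketched as below. This is an illustrative sketch only: the `get_weather` tool and its parameters are hypothetical, and the model ID is borrowed from the first-request example on this page.

```python
import json


def build_tool_call_payload(question: str) -> dict:
    """Build a Chat Completions body that offers the model one callable tool."""
    return {
        "model": "openai/gpt-4.1-mini",
        "messages": [{"role": "user", "content": question}],
        "tools": [
            {
                "type": "function",
                "function": {
                    # Hypothetical tool, defined only for this illustration.
                    "name": "get_weather",
                    "description": "Look up the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }


payload = build_tool_call_payload("What is the weather in Lisbon?")
body = json.dumps(payload)  # JSON string to send as the POST body
```

If the model decides to call the tool, an OpenAI-compatible response normally carries the call under the assistant message's `tool_calls` field; your code runs the tool and sends the result back in a follow-up message.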

Responses

Stateful text generation using the Responses API format.
Supports instructions, tools, reasoning, and streaming.

POST /v1/responses
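
A request body for this endpoint might look like the sketch below. It assumes Quantized accepts the field names used by the OpenAI Responses format (`instructions`, `input`, and `previous_response_id` for chaining turns); the `previous_response_id` support and the `resp_123` identifier in particular are assumptions, not confirmed behavior.

```python
import json
from typing import Optional


def build_responses_payload(
    user_input: str,
    instructions: str,
    previous_response_id: Optional[str] = None,
) -> dict:
    """Build a Responses-style body; optionally chain it to an earlier response."""
    payload = {
        "model": "openai/gpt-4.1-mini",
        "instructions": instructions,
        "input": user_input,
    }
    if previous_response_id is not None:
        # ASSUMPTION: state is chained the way the OpenAI Responses format does it.
        payload["previous_response_id"] = previous_response_id
    return payload


first_turn = build_responses_payload("Summarize RFC 2119.", "Answer in one sentence.")
follow_up = build_responses_payload(
    "Now give an example.", "Answer in one sentence.", previous_response_id="resp_123"
)
body = json.dumps(follow_up)  # JSON string to send as the POST body
```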

Embeddings

Generate vector embeddings for text. Compatible with the OpenAI Embeddings format.

POST /v1/embeddings
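
The sketch below shows one way to build an embeddings request body and read vectors out of an OpenAI-format response. The model ID `openai/text-embedding-3-small` is an assumption (check your model catalog for the real identifier), and `sample_response` is hand-written sample data, not real API output.

```python
import math


def build_embeddings_payload(texts: list) -> dict:
    """Build an OpenAI-style embeddings body for one or more input strings."""
    # ASSUMPTION: the model ID below is illustrative only.
    return {"model": "openai/text-embedding-3-small", "input": texts}


def extract_embeddings(response_body: dict) -> list:
    """Return the vectors in input order, using each item's `index` field."""
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Hand-written sample in the OpenAI Embeddings response shape.
sample_response = {
    "data": [
        {"index": 1, "embedding": [0.0, 1.0]},
        {"index": 0, "embedding": [1.0, 0.0]},
    ]
}
vectors = extract_embeddings(sample_response)
similarity = cosine_similarity(vectors[0], vectors[1])
```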

Image Generation

Generate images from text prompts. DALL-E 2/3 and gpt-image-1 are available today. Bedrock (Titan, Nova, Stability) and Google Imagen models are seeded in the catalog but stay inactive until an operator activates them.

POST /v1/images/generations
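
As a rough sketch, an image request and the decoding of its result could look like this. It assumes the endpoint mirrors the OpenAI Images format (`prompt`, `n`, `size` in the request; `data[0].b64_json` in the response), which this page does not state explicitly. The sample response is fabricated placeholder bytes, not a real image.

```python
import base64


def build_image_payload(prompt: str) -> dict:
    """Build an OpenAI-style image generation body for a single image."""
    # ASSUMPTION: field names follow the OpenAI Images format; the exact
    # model ID on Quantized may carry a provider prefix.
    return {"model": "gpt-image-1", "prompt": prompt, "n": 1, "size": "1024x1024"}


def decode_first_image(response_body: dict) -> bytes:
    """Return the raw bytes of the first image when it is sent inline as base64."""
    first = response_body["data"][0]
    if "b64_json" not in first:
        raise ValueError("response holds a URL rather than inline image data")
    return base64.b64decode(first["b64_json"])


# Placeholder response: the payload is dummy bytes standing in for image data.
sample_response = {
    "data": [{"b64_json": base64.b64encode(b"fake image bytes").decode("ascii")}]
}
image_bytes = decode_first_image(sample_response)
```

Writing `image_bytes` to a file opened in binary mode would save the image to disk.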

Web Search

Search the web and get structured results.

POST /v1/web-search

Fetch

Extract clean text content from any URL.

POST /v1/fetch
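
The two tool endpoints above (web search and fetch) are both plain JSON POST requests, so one helper can build either. Note that this page does not document their request schemas: the `query` and `url` field names below are hypothetical placeholders, so consult the API Reference for the real ones. The helper only constructs the request and does not send it.

```python
import json
import urllib.request

BASE_URL = "https://api.quantized.us/v1"


def build_post(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (without sending) an authenticated JSON POST to a Quantized endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# ASSUMPTION: "query" and "url" are hypothetical field names for illustration.
search_request = build_post(
    "/web-search", {"query": "open source vector databases"}, "sk-quantized-YOUR-KEY"
)
fetch_request = build_post(
    "/fetch", {"url": "https://example.com"}, "sk-quantized-YOUR-KEY"
)
```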

Supported providers

Provider       | Capabilities
OpenRouter     | LLMs (GPT-4, Claude, Llama, Gemini, …), Responses API, Embeddings (passthrough)
OpenAI Direct  | Embeddings (default), Image generation (default: DALL-E 2/3, gpt-image-1)
Anthropic      | Claude models directly
AWS Bedrock    | Chat completions, Responses, Bedrock-native embeddings, Image generation (Titan / Nova Canvas / Stability)
Google Gemini  | Gemini-native embeddings, Image generation (Imagen 4, Gemini Flash Image)
Exa            | Web search, content extraction
Tavily         | Web search, content extraction

You don’t need to manage separate API keys or accounts for each provider. Quantized handles provider routing, authentication, and billing automatically.

Get started

Get your API key

Your institution provides you with a Quantized API key (format: sk-quantized-...) or a JWT.


Make your first request

curl -X POST https://api.quantized.us/v1/chat/completions \
  -H "Authorization: Bearer sk-quantized-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4.1-mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
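
For readers who prefer Python, the same request can be sketched with the standard library alone. The function name `build_chat_request` is ours, chosen for illustration; the URL, headers, and body are copied from the curl command above. The request is only constructed here, and the commented lines show how it could be sent.

```python
import json
import urllib.request

API_URL = "https://api.quantized.us/v1/chat/completions"


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (without sending) the same POST request as the curl command."""
    payload = {
        "model": "openai/gpt-4.1-mini",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request("sk-quantized-YOUR-KEY", "Hello!")

# To send it, uncomment the lines below (needs a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```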

Explore the docs

Read the Quickstart guide or jump to the API Reference.

Key features

  • OpenAI-compatible — Use the OpenAI SDK with base_url="https://api.quantized.us/v1"
  • Unified billing — One credit balance across all providers and tools
  • Provider routing — Force a specific provider with the X-Quantized-Provider header, or let Quantized pick the default
  • Streaming — Server-Sent Events for real-time token delivery
  • Transparent pricing — Every response includes credits_used and credits_remaining
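
Several of the features above can be illustrated together in a small sketch. It rests on assumptions this page does not confirm: the provider value `anthropic` is a guess at the header's accepted values, the stream is assumed to use OpenAI-style chunks ending in a `[DONE]` sentinel, and `credits_used` / `credits_remaining` are assumed to sit at the top level of the response body. The sample data is hand-written.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def provider_headers(api_key: str, provider: str) -> dict:
    """Headers that pin the request to one provider via X-Quantized-Provider."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "X-Quantized-Provider": provider,  # ASSUMPTION: value format is a guess
    }


def iter_sse_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style Server-Sent Event lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # ASSUMPTION: OpenAI-style end-of-stream marker
            break
        choices = json.loads(data).get("choices") or []
        if not choices:
            continue
        fragment = choices[0].get("delta", {}).get("content")
        if fragment:
            yield fragment


def read_credits(response_body: dict) -> Tuple[Optional[float], Optional[float]]:
    """Return (credits_used, credits_remaining), or None for a missing field."""
    # ASSUMPTION: both fields are top-level keys of the JSON response.
    return response_body.get("credits_used"), response_body.get("credits_remaining")


sample_stream = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "",
    "data: [DONE]",
]
text = "".join(iter_sse_deltas(sample_stream))
headers = provider_headers("sk-quantized-YOUR-KEY", "anthropic")
credits = read_credits({"credits_used": 2, "credits_remaining": 98})
```

With the OpenAI Python SDK, the same header can typically be supplied through the client's `default_headers` argument alongside `base_url`.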