ENTERPILOT/GoModel

GoModel - AI Gateway in Go

A fast and lightweight AI gateway written in Go, providing a unified OpenAI-compatible API for OpenAI, Anthropic, Gemini, DeepSeek, xAI, Groq, OpenRouter, Z.ai, Azure OpenAI, Oracle, Ollama, and more.

[Screenshot: GoModel AI gateway dashboard showing AI usage analytics, observability panel, token and cost tracking, and estimated cost monitoring]

Quick Start with Docker

Step 1: Start GoModel container

docker run --rm -p 8080:8080 \
  -e LOGGING_ENABLED=true \
  -e LOGGING_LOG_BODIES=true \
  -e LOG_FORMAT=text \
  -e LOGGING_LOG_HEADERS=true \
  -e OPENAI_API_KEY="your-openai-key" \
  enterpilot/gomodel

Pass only the provider credentials or base URL you need (at least one required):

docker run --rm -p 8080:8080 \
  -e OPENAI_API_KEY="your-openai-key" \
  -e ANTHROPIC_API_KEY="your-anthropic-key" \
  -e GEMINI_API_KEY="your-gemini-key" \
  -e DEEPSEEK_API_KEY="your-deepseek-key" \
  -e GROQ_API_KEY="your-groq-key" \
  -e OPENROUTER_API_KEY="your-openrouter-key" \
  -e ZAI_API_KEY="your-zai-key" \
  -e XAI_API_KEY="your-xai-key" \
  -e AZURE_API_KEY="your-azure-key" \
  -e AZURE_BASE_URL="https://your-resource.openai.azure.com/openai/deployments/your-deployment" \
  -e AZURE_API_VERSION="2024-10-21" \
  -e ORACLE_API_KEY="your-oracle-key" \
  -e ORACLE_BASE_URL="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/v1" \
  -e ORACLE_MODELS="openai.gpt-oss-120b,xai.grok-3" \
  -e OLLAMA_BASE_URL="http://host.docker.internal:11434/v1" \
  -e VLLM_BASE_URL="http://host.docker.internal:8000/v1" \
  enterpilot/gomodel

⚠️ Avoid passing secrets via -e on the command line - they can leak via shell history and process lists. For production, use docker run --env-file .env to load API keys from a file instead.

Step 2: Make your first API call

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5-chat-latest",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

That's it! GoModel automatically detects which providers are available based on the credentials you supply.
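The same call can be made from Go. This is a minimal sketch that mirrors the curl request above; it assumes the gateway is listening on http://localhost:8080 with no authentication configured, and the helper name newChatRequest is illustrative rather than part of GoModel.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

// chatMessage and chatRequest mirror the JSON body used in the curl example.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// newChatRequest builds a POST to {baseURL}/v1/chat/completions.
// Hypothetical helper for illustration only.
func newChatRequest(baseURL, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newChatRequest("http://localhost:8080", "gpt-5-chat-latest", "Hello!")
	if err != nil {
		fmt.Fprintln(os.Stderr, "build request:", err)
		return
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		// Expected when no gateway is listening on localhost:8080.
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status)
	fmt.Println(string(out))
}
```

Separating request construction from sending keeps the example easy to adapt, for instance to add headers before calling Do.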

Supported LLM Providers

Example model identifiers are illustrative and subject to change; consult provider catalogs for current models. Feature columns reflect gateway API support, not every individual model capability exposed by an upstream provider.

Provider | Credential | Example Model | Chat | /responses | Embed | Files | Batches | Passthru
OpenAI | OPENAI_API_KEY | gpt-5.5
Anthropic | ANTHROPIC_API_KEY | claude-sonnet-4-20250514
Google Gemini | GEMINI_API_KEY | gemini-2.5-flash
DeepSeek | DEEPSEEK_API_KEY | deepseek-v4-pro
Groq | GROQ_API_KEY | llama-3.3-70b-versatile
OpenRouter | OPENROUTER_API_KEY | google/gemini-2.5-flash
Z.ai | ZAI_API_KEY (ZAI_BASE_URL optional) | glm-5.1
xAI (Grok) | XAI_API_KEY | grok-4
Azure OpenAI | AZURE_API_KEY + AZURE_BASE_URL (AZURE_API_VERSION optional) | gpt-5
Oracle | ORACLE_API_KEY + ORACLE_BASE_URL | openai.gpt-oss-120b
Ollama | OLLAMA_BASE_URL | llama3.2
vLLM | VLLM_BASE_URL (VLLM_API_KEY optional) | meta-llama/Llama-3.1-8B-Instruct

✅ Supported ❌ Unsupported

For Z.ai's GLM Coding Plan, set ZAI_BASE_URL=https://api.z.ai/api/coding/paas/v4.

DeepSeek defaults to https://api.deepseek.com; set DEEPSEEK_BASE_URL only when using a compatible proxy or alternate DeepSeek endpoint.

For vLLM, set VLLM_API_KEY only if the upstream server was started with --api-key.

Configured model lists are available for every provider with <PROVIDER>_MODELS, for example OPENROUTER_MODELS=openai/gpt-oss-120b,anthropic/claude-sonnet-4 or ORACLE_MODELS=openai.gpt-oss-120b,xai.grok-3. By default, CONFIGURED_PROVIDER_MODELS_MODE=fallback uses those lists only when upstream /models is unavailable or empty. Set CONFIGURED_PROVIDER_MODELS_MODE=allowlist to expose only configured models for providers that define a list, skipping their upstream /models calls.

To register multiple instances of the same provider type without config.yaml, use suffixed env vars such as OPENAI_EAST_API_KEY and OPENAI_EAST_BASE_URL; add OPENAI_EAST_MODELS to configure that instance's model list. This registers provider openai-east with type openai.
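The two rules above (suffixed env vars and the fallback/allowlist model modes) can be illustrated in Go. This is only a sketch of the documented behavior, not GoModel's implementation: parseProviderEnv and effectiveModels are hypothetical helpers, and the general naming rule is an assumption extrapolated from the single documented mapping OPENAI_EAST_* to openai-east.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parseProviderEnv splits a variable such as OPENAI_EAST_API_KEY into the
// provider type ("openai"), instance name ("openai-east"), and field
// ("API_KEY"). ASSUMPTION: lowercase the prefix and turn "_" into "-".
func parseProviderEnv(name string, knownTypes []string) (typ, instance, field string, ok bool) {
	for _, suffix := range []string{"_API_KEY", "_BASE_URL", "_MODELS"} {
		if !strings.HasSuffix(name, suffix) {
			continue
		}
		prefix := strings.ToLower(strings.TrimSuffix(name, suffix))
		for _, t := range knownTypes {
			if prefix == t || strings.HasPrefix(prefix, t+"_") {
				return t, strings.ReplaceAll(prefix, "_", "-"), strings.TrimPrefix(suffix, "_"), true
			}
		}
	}
	return "", "", "", false
}

// effectiveModels picks the model list to expose for one provider.
// fetchUpstream stands in for the provider's upstream /models call.
func effectiveModels(mode string, configured []string, fetchUpstream func() ([]string, error)) []string {
	if mode == "allowlist" && len(configured) > 0 {
		return configured // upstream /models is skipped entirely
	}
	upstream, err := fetchUpstream()
	if err != nil || len(upstream) == 0 {
		return configured // fallback: upstream unavailable or empty
	}
	return upstream
}

func main() {
	typ, instance, field, _ := parseProviderEnv("OPENAI_EAST_API_KEY", []string{"openai", "anthropic"})
	fmt.Println(typ, instance, field)

	configured := []string{"openai/gpt-oss-120b", "anthropic/claude-sonnet-4"}
	upstreamDown := func() ([]string, error) { return nil, errors.New("upstream /models unavailable") }
	fmt.Println(effectiveModels("fallback", configured, upstreamDown))
}
```

In fallback mode a healthy upstream list wins over the configured one; in allowlist mode the upstream function is never called for providers that define a list.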


Alternative Setup Methods

Running from Source

Prerequisites: Go 1.26.2+

  1. Create a .env file:

    cp .env.template .env

  2. Add your API keys to .env (at least one required).

  3. Start the server:

    make run

Docker Compose

Infrastructure only (Redis, PostgreSQL, MongoDB, Adminer - no image build):

docker compose up -d
# or: make infra

Full stack (adds GoModel + Prometheus; builds the app image):

cp .env.template .env
# Add your API keys to .env
docker compose --profile app up -d
# or: make image

Service | URL
GoModel API | http://localhost:8080
Adminer (DB UI) | http://localhost:8081
Prometheus | http://localhost:9090

Building the Docker Image Locally

docker build -t gomodel .
docker run --rm -p 8080:8080 --env-file .env gomodel

API Endpoints

OpenAI-Compatible API

Endpoint | Method | Description
/v1/chat/completions | POST | Chat completions (streaming supported)
/v1/responses | POST | OpenAI Responses API
/v1/embeddings | POST | Text embeddings
/v1/models | GET | List available models
/v1/files | POST | Upload a file (OpenAI-compatible multipart)
/v1/files | GET | List files
/v1/files/{id} | GET | Retrieve file metadata
/v1/files/{id} | DELETE | Delete a file
/v1/files/{id}/content | GET | Retrieve raw file content
/v1/batches | POST | Create a native provider batch (OpenAI-compatible schema; inline requests supported where provider-native)
/v1/batches | GET | List stored batches
/v1/batches/{id} | GET | Retrieve one stored batch
/v1/batches/{id}/cancel | POST | Cancel a pending batch
/v1/batches/{id}/results | GET | Retrieve native batch results when available

Provider Passthrough

Endpoint | Method | Description
/p/{provider}/... | GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS | Provider-native passthrough with opaque upstream responses

Admin Endpoints

Endpoint | Method | Description
/admin/dashboard | GET | Admin dashboard UI
/admin/api/v1/dashboard/config | GET | Dashboard configuration
/admin/api/v1/cache/overview | GET | Cache statistics overview
/admin/api/v1/usage/summary | GET | Aggregate token usage statistics
/admin/api/v1/usage/daily | GET | Per-period token usage breakdown
/admin/api/v1/usage/models | GET | Usage breakdown by model
/admin/api/v1/usage/user-paths | GET | Usage breakdown by user path
/admin/api/v1/usage/log | GET | Paginated usage log entries
/admin/api/v1/audit/log | GET | Paginated audit log entries
/admin/api/v1/audit/conversation | GET | Conversation thread around one audit entry
/admin/api/v1/providers/status | GET | Provider availability status
/admin/api/v1/runtime/refresh | POST | Refresh runtime configuration
/admin/api/v1/models | GET | List models with provider type
/admin/api/v1/models/categories | GET | List model categories
/admin/api/v1/model-overrides | GET | List model overrides
/admin/api/v1/model-overrides/:selector | PUT | Create/update model override
/admin/api/v1/model-overrides/:selector | DELETE | Remove model override
/admin/api/v1/auth-keys | GET | List authentication keys

Operations Endpoints

Endpoint | Method | Description
/health | GET | Health check
/metrics | GET | Prometheus metrics (experimental, when enabled)
/swagger/index.html | GET | Swagger UI (when enabled)
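A deployment script can poll /health before sending traffic. The sketch below assumes the endpoint answers HTTP 200 once the gateway is ready, which this README does not state explicitly; waitHealthy and httpCheck are illustrative names, not GoModel APIs.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"os"
	"time"
)

// waitHealthy calls check up to attempts times, sleeping delay between
// tries, and returns nil as soon as check reports HTTP 200.
func waitHealthy(check func() (int, error), attempts int, delay time.Duration) error {
	lastErr := errors.New("no attempts made")
	for i := 0; i < attempts; i++ {
		status, err := check()
		if err == nil && status == http.StatusOK {
			return nil
		}
		if err != nil {
			lastErr = err
		} else {
			lastErr = fmt.Errorf("unexpected status %d", status)
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("not healthy after %d attempts: %w", attempts, lastErr)
}

// httpCheck returns a check function that GETs url and reports the status code.
func httpCheck(url string) func() (int, error) {
	client := &http.Client{Timeout: 2 * time.Second}
	return func() (int, error) {
		resp, err := client.Get(url)
		if err != nil {
			return 0, err
		}
		defer resp.Body.Close()
		return resp.StatusCode, nil
	}
}

func main() {
	err := waitHealthy(httpCheck("http://localhost:8080/health"), 3, 200*time.Millisecond)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("gateway is healthy")
}
```

Passing the check as a function keeps the retry loop independent of HTTP, so it can be exercised with a stub.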

Gateway Configuration

GoModel is configured through environment variables and an optional config.yaml. Environment variables override YAML values. See .env.template and config/config.example.yaml for the available options.

Key settings:

Variable | Default | Description
PORT | 8080 | Server port
BASE_PATH | / | Mount the gateway under a path prefix such as /g
GOMODEL_MASTER_KEY | (none) | API key for authentication
ENABLE_PASSTHROUGH_ROUTES | true | Enable provider-native passthrough routes under /p/{provider}/...
ALLOW_PASSTHROUGH_V1_ALIAS | true | Allow /p/{provider}/v1/... aliases while keeping /p/{provider}/... canonical
ENABLED_PASSTHROUGH_PROVIDERS | openai,anthropic,openrouter,zai,vllm | Comma-separated list of enabled passthrough providers
STORAGE_TYPE | sqlite | Storage backend (sqlite, postgresql, mongodb)
METRICS_ENABLED | false | Enable Prometheus metrics (experimental)
LOGGING_ENABLED | false | Enable audit logging
GUARDRAILS_ENABLED | false | Enable the configured guardrails pipeline
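The precedence described above (environment variable over config.yaml over built-in default) can be sketched as follows. The resolve helper is hypothetical, and the flat map keyed by env-style names is a simplification; the real config.yaml layout is documented in config/config.example.yaml.

```go
package main

import (
	"fmt"
	"os"
)

// resolve returns the effective value for key: the environment first,
// then the YAML-derived map, then the built-in default.
// ASSUMPTION: a variable that is set but empty still counts as set.
func resolve(key string, lookupEnv func(string) (string, bool), yaml map[string]string, def string) string {
	if v, ok := lookupEnv(key); ok {
		return v
	}
	if v, ok := yaml[key]; ok {
		return v
	}
	return def
}

func main() {
	// Stand-in for values parsed from config.yaml.
	yaml := map[string]string{"PORT": "9000"}
	fmt.Println("PORT =", resolve("PORT", os.LookupEnv, yaml, "8080"))
	fmt.Println("STORAGE_TYPE =", resolve("STORAGE_TYPE", os.LookupEnv, yaml, "sqlite"))
}
```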

Quick Start - Authentication: GOMODEL_MASTER_KEY is unset by default. Without it, the API endpoints are unprotected and anyone who can reach the service can call them, which is insecure for production. Set GOMODEL_MASTER_KEY to a strong secret in your .env or environment before exposing the service.
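Once a master key is set, clients have to send it with each request. The sketch below assumes the key is passed as an OpenAI-style bearer token in the Authorization header; this README does not spell out the header format, so treat that as an assumption. newGatewayRequest is an illustrative helper.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// newGatewayRequest builds a request and, when masterKey is non-empty,
// attaches it as a bearer token.
// ASSUMPTION: the gateway accepts "Authorization: Bearer <key>".
func newGatewayRequest(method, url, masterKey string) (*http.Request, error) {
	req, err := http.NewRequest(method, url, nil)
	if err != nil {
		return nil, err
	}
	if masterKey != "" {
		req.Header.Set("Authorization", "Bearer "+masterKey)
	}
	return req, nil
}

func main() {
	// Read the key from the environment rather than hard-coding it.
	key := os.Getenv("GOMODEL_MASTER_KEY")
	req, err := newGatewayRequest(http.MethodGet, "http://localhost:8080/v1/models", key)
	if err != nil {
		fmt.Fprintln(os.Stderr, "build request:", err)
		return
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		// Expected when no gateway is listening on localhost:8080.
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}
```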


Response Caching

GoModel has a two-layer response cache that reduces LLM API costs and latency for repeated or semantically similar requests.

Layer 1 - Exact-match cache

Hashes the full request (path + Workflow + body) and returns a stored response when an incoming request is byte-identical. Lookups take less than a millisecond. Enable it with the RESPONSE_CACHE_SIMPLE_ENABLED and REDIS_URL environment variables.

Responses served from this layer carry X-Cache: HIT (exact).
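To make the exact-match idea concrete, here is a minimal sketch of deriving a cache key from the path, a workflow identifier, and the raw body. It uses SHA-256 with length-prefixed parts; the hash function, the encoding, and the exactCacheKey name are assumptions for illustration, not GoModel's actual key derivation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// exactCacheKey hashes path, workflow, and raw body into one hex key.
// Each part is length-prefixed so ("ab", "c") and ("a", "bc") differ.
func exactCacheKey(path, workflow string, body []byte) string {
	h := sha256.New()
	for _, part := range [][]byte{[]byte(path), []byte(workflow), body} {
		fmt.Fprintf(h, "%d:", len(part))
		h.Write(part)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	body := []byte(`{"model":"gpt-5-chat-latest","messages":[{"role":"user","content":"Hello!"}]}`)
	k1 := exactCacheKey("/v1/chat/completions", "default", body)
	k2 := exactCacheKey("/v1/chat/completions", "default", body)
	// A single extra byte of whitespace changes the key, so the lookup misses.
	k3 := exactCacheKey("/v1/chat/completions", "default", append(body, ' '))
	fmt.Println("identical requests share a key:", k1 == k2)
	fmt.Println("one extra byte changes the key:", k1 != k3)
}
```

Because the key depends on the raw bytes, two requests that differ only in JSON whitespace or field order would not share an entry at this layer.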

Layer 2 - Semantic cache

Embeds the last user message via your configured provider’s OpenAI-compatible /v1/embeddings API (cache.response.semantic.embedder.provider must name a key in the top-level providers map) and performs a KNN vector search. Semantically equivalent queries - e.g. "What's the capital of France?" vs "Which city is France's capital?" - can return the same cached response without an upstream LLM call.

Expected hit rates: ~60–70% in high-repetition workloads vs. ~18% for exact-match alone.

Responses served from this layer carry X-Cache: HIT (semantic).

Supported vector backends: qdrant, pgvector, pinecone, weaviate (set cache.response.semantic.vector_store.type and the matching nested block).
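The lookup step can be pictured as a nearest-neighbor search over stored embeddings. The sketch below is a brute-force, in-memory stand-in with k = 1 and cosine similarity; the real search runs in the configured vector backend, and the similarity threshold, the entry type, and the toy two-dimensional vectors are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// entry pairs a stored prompt embedding with its cached response.
type entry struct {
	Vector   []float64
	Response string
}

// cosine returns the cosine similarity of a and b, or 0 when the
// vectors differ in length or either one has zero magnitude.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// lookup returns the response of the most similar entry whose
// similarity is at least threshold, or false on a miss.
func lookup(query []float64, entries []entry, threshold float64) (string, bool) {
	best, bestScore := -1, threshold
	for i, e := range entries {
		if s := cosine(query, e.Vector); s >= bestScore {
			best, bestScore = i, s
		}
	}
	if best < 0 {
		return "", false
	}
	return entries[best].Response, true
}

func main() {
	// Toy embeddings; real ones come from the /v1/embeddings API.
	store := []entry{
		{Vector: []float64{1, 0}, Response: "Paris"},
		{Vector: []float64{0, 1}, Response: "Berlin"},
	}
	resp, hit := lookup([]float64{0.9, 0.1}, store, 0.9)
	fmt.Println(resp, hit)
}
```

A higher threshold trades hit rate for a lower risk of returning a cached answer to a question that only looks similar.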

Both cache layers run after guardrail/workflow patching so they always see the final prompt. Use Cache-Control: no-cache or Cache-Control: no-store to bypass caching per-request.
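On the client side, the bypass directives and the X-Cache response values can be handled with two small helpers. This is a sketch: bypassesCache and cacheLayer are illustrative names, and the parsing shown (comma-separated, case-insensitive directives) follows general HTTP conventions rather than any documented GoModel behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// bypassesCache reports whether a Cache-Control header value contains
// the no-cache or no-store directive.
func bypassesCache(cacheControl string) bool {
	for _, d := range strings.Split(cacheControl, ",") {
		switch strings.ToLower(strings.TrimSpace(d)) {
		case "no-cache", "no-store":
			return true
		}
	}
	return false
}

// cacheLayer maps an X-Cache response header value to the layer that
// served the response, or "" when the value is not a documented hit.
func cacheLayer(xCache string) string {
	switch xCache {
	case "HIT (exact)":
		return "exact"
	case "HIT (semantic)":
		return "semantic"
	}
	return ""
}

func main() {
	fmt.Println(bypassesCache("no-cache"))
	fmt.Println(bypassesCache("max-age=60"))
	fmt.Println(cacheLayer("HIT (semantic)"))
}
```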


See DEVELOPMENT.md for testing, linting, and pre-commit setup.


Roadmap to 0.2.0

Must Have

  • Intelligent routing
  • Broader provider support: Cohere, Command A, and Operational
  • Budget management with limits per user_path and/or API key
  • Editable model pricing for accurate cost tracking and budgeting
  • Full support for the OpenAI /responses and /conversations lifecycle
  • Prompt cache visibility showing how much of each prompt was cached by the provider
  • Guardrails hardening: better UI, simpler architecture, easier custom guardrails, and response-side guardrails before output reaches the client
  • Passthrough for all providers, beyond the current OpenAI and Anthropic beta
  • Fix failover charts in the dashboard

Should Have

  • Cluster mode

Community

Join our Discord to connect with other GoModel users.
