A fast and lightweight AI gateway written in Go, providing a unified OpenAI-compatible API for OpenAI, Anthropic, Gemini, DeepSeek, xAI, Groq, OpenRouter, Z.ai, Azure OpenAI, Oracle, Ollama, and more.
Step 1: Start GoModel container
```bash
docker run --rm -p 8080:8080 \
  -e LOGGING_ENABLED=true \
  -e LOGGING_LOG_BODIES=true \
  -e LOG_FORMAT=text \
  -e LOGGING_LOG_HEADERS=true \
  -e OPENAI_API_KEY="your-openai-key" \
  enterpilot/gomodel
```

Pass only the provider credentials or base URLs you need (at least one is required):
```bash
docker run --rm -p 8080:8080 \
  -e OPENAI_API_KEY="your-openai-key" \
  -e ANTHROPIC_API_KEY="your-anthropic-key" \
  -e GEMINI_API_KEY="your-gemini-key" \
  -e DEEPSEEK_API_KEY="your-deepseek-key" \
  -e GROQ_API_KEY="your-groq-key" \
  -e OPENROUTER_API_KEY="your-openrouter-key" \
  -e ZAI_API_KEY="your-zai-key" \
  -e XAI_API_KEY="your-xai-key" \
  -e AZURE_API_KEY="your-azure-key" \
  -e AZURE_BASE_URL="https://your-resource.openai.azure.com/openai/deployments/your-deployment" \
  -e AZURE_API_VERSION="2024-10-21" \
  -e ORACLE_API_KEY="your-oracle-key" \
  -e ORACLE_BASE_URL="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/v1" \
  -e ORACLE_MODELS="openai.gpt-oss-120b,xai.grok-3" \
  -e OLLAMA_BASE_URL="http://host.docker.internal:11434/v1" \
  -e VLLM_BASE_URL="http://host.docker.internal:8000/v1" \
  enterpilot/gomodel
```

Avoid passing API keys with `-e` on the command line - they can leak via shell history and process lists. For production, use `docker run --env-file .env` to load API keys from a file instead.
Step 2: Make your first API call
```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5-chat-latest",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

That's it! GoModel automatically detects which providers are available based on the credentials you supply.
Example model identifiers are illustrative and subject to change; consult provider catalogs for current models. Feature columns reflect gateway API support, not every individual model capability exposed by an upstream provider.
| Provider | Credential | Example Model | Chat | /responses | Embed | Files | Batches | Passthru |
|---|---|---|---|---|---|---|---|---|
| OpenAI | `OPENAI_API_KEY` | gpt-5.5 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Anthropic | `ANTHROPIC_API_KEY` | claude-sonnet-4-20250514 | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ |
| Google Gemini | `GEMINI_API_KEY` | gemini-2.5-flash | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| DeepSeek | `DEEPSEEK_API_KEY` | deepseek-v4-pro | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Groq | `GROQ_API_KEY` | llama-3.3-70b-versatile | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| OpenRouter | `OPENROUTER_API_KEY` | google/gemini-2.5-flash | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Z.ai | `ZAI_API_KEY` (`ZAI_BASE_URL` optional) | glm-5.1 | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| xAI (Grok) | `XAI_API_KEY` | grok-4 | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| Azure OpenAI | `AZURE_API_KEY` + `AZURE_BASE_URL` (`AZURE_API_VERSION` optional) | gpt-5 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Oracle | `ORACLE_API_KEY` + `ORACLE_BASE_URL` | openai.gpt-oss-120b | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Ollama | `OLLAMA_BASE_URL` | llama3.2 | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| vLLM | `VLLM_BASE_URL` (`VLLM_API_KEY` optional) | meta-llama/Llama-3.1-8B-Instruct | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |

✅ Supported ❌ Unsupported
For Z.ai's GLM Coding Plan, set ZAI_BASE_URL=https://api.z.ai/api/coding/paas/v4.
Configured model lists are available for every provider via `<PROVIDER>_MODELS`, for example `OPENROUTER_MODELS=openai/gpt-oss-120b,anthropic/claude-sonnet-4` or `ORACLE_MODELS=openai.gpt-oss-120b,xai.grok-3`. DeepSeek defaults to `https://api.deepseek.com`; set `DEEPSEEK_BASE_URL` only when using a compatible proxy or alternate DeepSeek endpoint. By default, `CONFIGURED_PROVIDER_MODELS_MODE=fallback` uses those lists only when upstream `/models` is unavailable or empty. Set `CONFIGURED_PROVIDER_MODELS_MODE=allowlist` to expose only configured models for providers that define a list, skipping their upstream `/models` calls.
For vLLM, set VLLM_API_KEY only if the upstream server was started with
--api-key.
To register multiple instances of the same provider type without `config.yaml`, use suffixed env vars such as `OPENAI_EAST_API_KEY` and `OPENAI_EAST_BASE_URL`; add `OPENAI_EAST_MODELS` to configure that instance's model list. This registers provider `openai-east` with type `openai`.
Prerequisites: Go 1.26.2+
- Create a `.env` file: `cp .env.template .env`
- Add your API keys to `.env` (at least one required).
- Start the server: `make run`
Infrastructure only (Redis, PostgreSQL, MongoDB, Adminer - no image build):

```bash
docker compose up -d
# or: make infra
```

Full stack (adds GoModel + Prometheus; builds the app image):

```bash
cp .env.template .env
# Add your API keys to .env
docker compose --profile app up -d
# or: make image
```

| Service | URL |
|---|---|
| GoModel API | http://localhost:8080 |
| Adminer (DB UI) | http://localhost:8081 |
| Prometheus | http://localhost:9090 |
```bash
docker build -t gomodel .
docker run --rm -p 8080:8080 --env-file .env gomodel
```

| Endpoint | Method | Description |
|---|---|---|
| `/v1/chat/completions` | POST | Chat completions (streaming supported) |
| `/v1/responses` | POST | OpenAI Responses API |
| `/v1/embeddings` | POST | Text embeddings |
| `/v1/models` | GET | List available models |
| `/v1/files` | POST | Upload a file (OpenAI-compatible multipart) |
| `/v1/files` | GET | List files |
| `/v1/files/{id}` | GET | Retrieve file metadata |
| `/v1/files/{id}` | DELETE | Delete a file |
| `/v1/files/{id}/content` | GET | Retrieve raw file content |
| `/v1/batches` | POST | Create a native provider batch (OpenAI-compatible schema; inline requests supported where provider-native) |
| `/v1/batches` | GET | List stored batches |
| `/v1/batches/{id}` | GET | Retrieve one stored batch |
| `/v1/batches/{id}/cancel` | POST | Cancel a pending batch |
| `/v1/batches/{id}/results` | GET | Retrieve native batch results when available |
| Endpoint | Method | Description |
|---|---|---|
| `/p/{provider}/...` | GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS | Provider-native passthrough with opaque upstream responses |
| Endpoint | Method | Description |
|---|---|---|
| `/admin/dashboard` | GET | Admin dashboard UI |
| `/admin/api/v1/dashboard/config` | GET | Dashboard configuration |
| `/admin/api/v1/cache/overview` | GET | Cache statistics overview |
| `/admin/api/v1/usage/summary` | GET | Aggregate token usage statistics |
| `/admin/api/v1/usage/daily` | GET | Per-period token usage breakdown |
| `/admin/api/v1/usage/models` | GET | Usage breakdown by model |
| `/admin/api/v1/usage/user-paths` | GET | Usage breakdown by user path |
| `/admin/api/v1/usage/log` | GET | Paginated usage log entries |
| `/admin/api/v1/audit/log` | GET | Paginated audit log entries |
| `/admin/api/v1/audit/conversation` | GET | Conversation thread around one audit entry |
| `/admin/api/v1/providers/status` | GET | Provider availability status |
| `/admin/api/v1/runtime/refresh` | POST | Refresh runtime configuration |
| `/admin/api/v1/models` | GET | List models with provider type |
| `/admin/api/v1/models/categories` | GET | List model categories |
| `/admin/api/v1/model-overrides` | GET | List model overrides |
| `/admin/api/v1/model-overrides/:selector` | PUT | Create/update a model override |
| `/admin/api/v1/model-overrides/:selector` | DELETE | Remove a model override |
| `/admin/api/v1/auth-keys` | GET | List authentication keys |
| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Health check |
| `/metrics` | GET | Prometheus metrics (experimental, when enabled) |
| `/swagger/index.html` | GET | Swagger UI (when enabled) |
GoModel is configured through environment variables and an optional config.yaml. Environment variables override YAML values. See .env.template and config/config.example.yaml for the available options.
Key settings:
| Variable | Default | Description |
|---|---|---|
| `PORT` | `8080` | Server port |
| `BASE_PATH` | `/` | Mount the gateway under a path prefix such as `/g` |
| `GOMODEL_MASTER_KEY` | (none) | API key for authentication |
| `ENABLE_PASSTHROUGH_ROUTES` | `true` | Enable provider-native passthrough routes under `/p/{provider}/...` |
| `ALLOW_PASSTHROUGH_V1_ALIAS` | `true` | Allow `/p/{provider}/v1/...` aliases while keeping `/p/{provider}/...` canonical |
| `ENABLED_PASSTHROUGH_PROVIDERS` | `openai,anthropic,openrouter,zai,vllm` | Comma-separated list of enabled passthrough providers |
| `STORAGE_TYPE` | `sqlite` | Storage backend (`sqlite`, `postgresql`, `mongodb`) |
| `METRICS_ENABLED` | `false` | Enable Prometheus metrics (experimental) |
| `LOGGING_ENABLED` | `false` | Enable audit logging |
| `GUARDRAILS_ENABLED` | `false` | Enable the configured guardrails pipeline |
Quick Start - Authentication: By default `GOMODEL_MASTER_KEY` is unset. Without this key, the API endpoints are unprotected and anyone can call them, which is insecure for production. Set a strong secret before exposing the service: add `GOMODEL_MASTER_KEY` to your `.env` or environment for production deployments.
GoModel has a two-layer response cache that reduces LLM API costs and latency for repeated or semantically similar requests.
Hashes the full request (path + workflow + body) and returns a stored response for byte-identical requests, with sub-millisecond lookups. Activate it via the environment variables `RESPONSE_CACHE_SIMPLE_ENABLED` and `REDIS_URL`.
Responses served from this layer carry X-Cache: HIT (exact).
Embeds the last user message via your configured provider’s OpenAI-compatible /v1/embeddings API (cache.response.semantic.embedder.provider must name a key in the top-level providers map) and performs a KNN vector search. Semantically equivalent queries - e.g. "What's the capital of France?" vs "Which city is France's capital?" - can return the same cached response without an upstream LLM call.
Expected hit rates: ~60–70% in high-repetition workloads vs. ~18% for exact-match alone.
Responses served from this layer carry X-Cache: HIT (semantic).
Supported vector backends: qdrant, pgvector, pinecone, weaviate (set cache.response.semantic.vector_store.type and the matching nested block).
Both cache layers run after guardrail/workflow patching so they always see the final prompt. Use Cache-Control: no-cache or Cache-Control: no-store to bypass caching per-request.
See DEVELOPMENT.md for testing, linting, and pre-commit setup.
- Intelligent routing
- Broader provider support: Cohere, Command A, and Operational
- Budget management with limits per `user_path` and/or API key
- Editable model pricing for accurate cost tracking and budgeting
- Full support for the OpenAI `/responses` and `/conversations` lifecycle
- Prompt cache visibility showing how much of each prompt was cached by the provider
- Guardrails hardening: better UI, simpler architecture, easier custom guardrails, and response-side guardrails before output reaches the client
- Passthrough for all providers, beyond the current OpenAI and Anthropic beta
- Fix failover charts in the dashboard
- Cluster mode
Join our Discord to connect with other GoModel users.