Litellm

App in the BluixApps catalog

What it is

LiteLLM is an OpenAI-compatible proxy that fronts 100+ LLM providers — OpenAI, Anthropic, Google, Mistral, Cohere, Hugging Face, Ollama, AWS Bedrock, Azure, and dozens more. Your code calls litellm.completion() (or the proxy's OpenAI-compatible REST endpoint) and LiteLLM routes to the actual provider with retries, fallbacks, cost tracking, and load balancing.

It's the "LLM router" pattern — the one piece of infrastructure that makes provider-switching painless.

What it's for

  • Provider-agnostic LLM apps — write code once, switch backends via config
  • Cost optimization — route cheap queries to cheaper models automatically
  • High availability — fallback chains across providers when one is down
  • Budget enforcement — per-user / per-team spend limits with alerts
  • Migration — gradual swap from OpenAI to Anthropic without code changes

Who it's for

  • AI platform teams standardizing LLM access across the org
  • Enterprises needing audit trail + budget enforcement on LLM usage
  • Multi-LLM apps wanting to A/B test providers without code refactor
  • Cost-conscious startups routing dev traffic to cheap providers, prod to premium
  • Resellers offering "OpenAI-compatible API" while proxying to multiple backends

Why teams pick LiteLLM over alternatives

  • OpenAI-compatible API — every OpenAI SDK works without code changes
  • 100+ providers — most comprehensive LLM router in OSS
  • Cost + token tracking — built-in spend analytics per key
  • Routing rules — match queries to models by cost, latency, region
  • MIT license — clean for commercial / production
  • Active development — releases multiple times per week

Integrations

  • LLM providers — OpenAI, Anthropic, Google, AWS Bedrock, Azure, Mistral, Cohere, HuggingFace, Ollama, vLLM, custom
  • Observability — Langfuse, Helicone, OpenTelemetry, Prometheus metrics
  • Caching — Redis-backed response cache to avoid duplicate API calls
  • Auth — JWT, API keys with per-key rate limits + budgets
  • Database — Postgres for spend tracking, key management, audit log
  • Admin UI — built-in dashboard for keys, costs, model usage
  • SDK clients — Python (native), JS via OpenAI SDK pointed at proxy

Notable users & community

  • 15k+ GitHub stars
  • Adopted by major AI platform teams as standard LLM router
  • Featured in enterprise AI architecture guides
  • Backed by BerriAI with active commercial enterprise offering
  • Strong Discord, weekly releases, predictable roadmap

Tips & operations

  • Always run with database — without Postgres, spend tracking + key management don't persist
  • Set budgets per key — without budget caps, a single buggy client can rack up huge bills
  • Use response caching — Redis cache on identical prompts saves significant cost
  • Monitor via Langfuse — built-in LiteLLM → Langfuse integration captures every call for debugging
  • Health check models — LiteLLM's health endpoint pings each provider; integrate with uptime monitoring
  • Update frequently — provider APIs change; LiteLLM releases track them; stale versions = silent failures

What we ship in BluixApps

  • Docker compose: LiteLLM proxy + Postgres + Redis
  • Pinned ghcr.io/berriai/litellm:latest (locked to release tag)
  • HTTPS via Let's Encrypt; admin UI with random master key
  • Pre-configured for Ollama detection on same VPS
  • Postgres for spend tracking + key persistence
  • Redis for response caching
  • Backup hook covers Postgres (keys + spend history)
Read this app's deep dive on bluix.app ↗

Get this app — pick a BluixApps plan

Same catalog. Scaling tenant isolation, white-label and support tier.

TierTenantsCatalogSupportWhite-labelMonthly
Stacks119 curated stacksStandard$19/moDetailDeploy
Starter10Full catalogStandard+$15–25/mo$49/moDetailDeploy
Pro25Full catalogPriority bugfix+$15–25/mo$149/moDetailDeploy
Growth100Full catalogPriority bugfix+$15–25/mo$349/moDetailDeploy
Scale500Full catalog7-day window+$15–25/mo$799/moDetailDeploy
EnterpriseUnlimitedFull catalogPriority 7-dayBundled$1,499/moDetailDeploy

Powered by WHMCompleteSolution