tomaasz

Self-hosted LiteLLM proxy with auto-discovery of free/trial LLM models from OpenRouter, Groq, Gemini, Cerebras, SambaNova, Cohere, NVIDIA NIM and more

89% credibility
Found Apr 29, 2026 at 12 stars
AI Summary

Self-hosted proxy that auto-discovers and unifies free-tier large language model APIs from multiple providers into a single OpenAI-compatible endpoint with smart routing groups.

How It Works

1. 🔍 Discover free AI access

You find a helpful tool online that gathers powerful chat AIs from services offering free trials, letting you use them all in one place.

2. 📱 Sign up for free AI services

You create accounts at friendly AI playgrounds like Groq or OpenRouter that give away free chat turns without needing a credit card.

3. 🔗 Connect your free services

You link these free accounts to the tool by adding their API keys, unlocking their models for your personal use.

4. 🚀 Launch your AI hub

You start the tool with a quick setup, and suddenly you have your own free AI server running right on your computer.

5. 🖥️ Open the friendly dashboard

You visit a simple web page to see lists of ready-to-use AI personalities grouped by smarts, speed, or specialties.

6. 💬 Pick and chat with an AI

You choose a group like 'smart' for top-quality answers or 'fast' for quick chats, then type your first question.

Enjoy endless free chats

Your messages get clever, helpful replies from the best free AIs, powering your projects or curiosity without any cost.
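The steps above reduce to a single OpenAI-style request once the proxy is running. A minimal sketch, assuming the proxy's default local endpoint on port 4000 and the "smart"/"fast" group names from this page; the helper only builds the payload, and actually sending it (shown commented out) requires the running proxy and your own key:

```python
import json
from urllib import request

# Assumed local endpoint: the proxy serves an OpenAI-compatible API on port 4000.
PROXY_URL = "http://localhost:4000/v1/chat/completions"

def build_chat_request(group: str, prompt: str) -> dict:
    """Build an OpenAI-style chat payload addressed to a routing group."""
    return {
        "model": group,  # routing group, e.g. "smart" or "fast"
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_chat_request("smart", "What is a reverse proxy?")
body = json.dumps(payload).encode()

# With the proxy up, the request would be sent like this (key name is a placeholder):
# req = request.Request(PROXY_URL, data=body,
#                       headers={"Content-Type": "application/json",
#                                "Authorization": "Bearer sk-your-proxy-key"})
# print(request.urlopen(req).read().decode())
print(payload["model"])  # → smart
```

Because the endpoint speaks the standard chat-completions shape, any OpenAI client library can be pointed at it by overriding the base URL instead of hand-building payloads like this.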

AI-Generated Review

What is litellm-free-models-proxy?

This Python-based self-hosted LiteLLM proxy auto-discovers free and trial LLM models from providers like OpenRouter, Groq, Gemini, Cerebras, SambaNova, Cohere, and NVIDIA NIM, exposing them via a single OpenAI-compatible API endpoint on port 4000. It solves the hassle of juggling multiple API keys and endpoints for cost-free inference by syncing models every 8 hours and load-balancing across keys. Fire it up with Docker Compose, add your keys to .env, and query routing groups like "smart" or "fast" from any OpenAI client, LangChain, or tools like Cursor.
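The "load-balancing across keys" behavior can be pictured as simple rotation: spread requests over several keys for the same provider so no single free quota runs out first. This is an illustrative sketch, not the proxy's actual code, and the key names are invented:

```python
from itertools import cycle

# Hypothetical pool of API keys for one provider; rotating through them
# spreads free-tier quota consumption evenly.
key_pool = ["GROQ_KEY_1", "GROQ_KEY_2", "GROQ_KEY_3"]
rotation = cycle(key_pool)

def next_key() -> str:
    """Round-robin key selection, the simplest load-balancing policy."""
    return next(rotation)

picked = [next_key() for _ in range(5)]
print(picked)  # → ['GROQ_KEY_1', 'GROQ_KEY_2', 'GROQ_KEY_3', 'GROQ_KEY_1', 'GROQ_KEY_2']
```

Real routers typically layer rate-limit and error awareness on top of this, skipping keys that have hit their quota, but round-robin is the core idea.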

Why is it gaining traction?

Unlike a plain self-hosted LiteLLM setup, it auto-discovers free and trial models without manual config tweaks, pulling from provider APIs and community lists to keep availability fresh. Pre-built routing groups (reasoning, coder, vision) and direct prefixes (groq/llama-3.3-70b, gemini/flash) make switching models seamless, while optional Postgres logging adds observability. Devs like the drop-in compatibility with self-hosted GitHub Copilot or Codespaces workflows, with no vendor lock-in.
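The routing groups and direct prefixes mentioned above are two addressing schemes for the same models: a group name fans out to several candidates, while a provider-prefixed ID targets one model directly. A sketch of that resolution; the group names and the two model IDs come from this page, but the exact grouping below is an assumption:

```python
# Assumed mapping from routing groups to provider-prefixed model IDs.
# "smart"/"fast" and the IDs "groq/llama-3.3-70b" and "gemini/flash"
# appear in the project description; this particular grouping is illustrative.
ROUTING_GROUPS = {
    "smart": ["gemini/flash", "groq/llama-3.3-70b"],
    "fast": ["groq/llama-3.3-70b"],
}

def resolve(model: str) -> list[str]:
    """A group name expands to its candidates; a prefixed ID passes through."""
    if model in ROUTING_GROUPS:
        return ROUTING_GROUPS[model]
    return [model]  # e.g. "groq/llama-3.3-70b" bypasses group routing

print(resolve("fast"))          # → ['groq/llama-3.3-70b']
print(resolve("gemini/flash"))  # → ['gemini/flash']
```

The pass-through branch is what lets the same endpoint serve both styles: clients that want the proxy to choose send a group name, and clients that need a specific model send the prefixed ID.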

Who should use this?

Indie devs prototyping AI apps on a zero budget, backend teams routing free or trial inference through production pipelines, or frontend engineers integrating chat and vision models into their apps. It is also a good fit for self-hosted GitHub Actions runners that need quick LLM access without paid tiers.

Verdict

Grab it for free LLM experimentation: excellent docs and a one-command Docker start make it dead simple, despite the 12-star count signaling early maturity; its credibility score sits at 89%. Scale up to paid tiers once your free quotas hit their limits.
