mitsuhiko/pi-llamacpp

An experimental Pi extension that runs and manages Qwen with llama.cpp

79 stars · TypeScript
AI Summary

An extension for the Pi coding agent that enables running large local AI models using llama.cpp for self-hosted inference.

How It Works

1
🔍 Discover Local AI

You learn about an extension that lets the Pi coding agent use large language models running entirely on your own machine, keeping inference private and fast.

2
📥 Install the Extension

A single install command in Pi adds the extension to your setup.

3
🔄 Reload Pi

Reload Pi so it picks up the new extension.

4
🧠 Pick Your Model

New local model options appear in Pi: quantized Qwen variants in different sizes suited to coding tasks.

5
It Sets Up Automatically

Behind the scenes, Pi downloads the model weights and prepares the llama.cpp runtime, with no manual setup required.

6
Code with Local Inference

Pi now talks to the local llama-server, giving you coding assistance instantly, privately, and without a network round trip.
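The automatic setup described above ultimately comes down to launching `llama-server` against a downloaded GGUF file. Below is a minimal, hypothetical sketch of how such a launch command might be assembled; the `LocalModel` shape and `buildServerArgs` helper are illustrative, not pi-llamacpp's actual API, though `-m`, `--port`, and `-c` are real llama-server flags.

```typescript
// Hypothetical sketch: assemble a llama-server launch command for a chosen
// local model. Names and paths here are illustrative examples only.

interface LocalModel {
  name: string;      // id shown in Pi's model picker
  ggufPath: string;  // path of the downloaded GGUF weights file
  contextSize: number;
}

function buildServerArgs(model: LocalModel, port: number): string[] {
  // llama-server flags: -m model file, --port listen port, -c context size
  return [
    "-m", model.ggufPath,
    "--port", String(port),
    "-c", String(model.contextSize),
  ];
}

const qwen: LocalModel = {
  name: "qwen-local",
  ggufPath: "/models/qwen.gguf",
  contextSize: 8192,
};

console.log(buildServerArgs(qwen, 8080).join(" "));
```

In a real extension these arguments would be handed to a process spawner, with the resulting child process tracked so it can be shut down when Pi exits.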

AI-Generated Review

What is pi-llamacpp?

This experimental TypeScript extension for the Pi coding agent brings local llama.cpp inference right into your workflow, registering quantized Qwen3.6 models (dense 27B and MoE 35B-A3B) under the `llamacpp` provider. It auto-downloads GGUF files from Hugging Face, builds a compatible llama.cpp runtime (with MTP/NextN support), spins up `llama-server` on demand, and shuts it down when Pi quits. Install with `pi install https://github.com/mitsuhiko/pi-llamacpp`, then use `/llamacpp status` or `/llamacpp` for logs.
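Lease-based server control, as mentioned above, typically means the server process stays alive while any consumer holds a lease and is torn down when the last lease is released. A minimal sketch, with class and callback names that are assumptions rather than pi-llamacpp's actual implementation:

```typescript
// Illustrative lease-based lifecycle: start the server on the first lease,
// stop it when the last lease is released. Names are hypothetical.

class ServerLease {
  private leases = 0;
  private running = false;

  constructor(
    private start: () => void, // e.g. spawn llama-server
    private stop: () => void,  // e.g. terminate the child process
  ) {}

  acquire(): void {
    if (this.leases === 0 && !this.running) {
      this.start();
      this.running = true;
    }
    this.leases++;
  }

  release(): void {
    this.leases = Math.max(0, this.leases - 1);
    if (this.leases === 0 && this.running) {
      this.stop();
      this.running = false;
    }
  }

  get isRunning(): boolean {
    return this.running;
  }
}
```

With this shape, two overlapping Pi requests share a single server process, and the process exits cleanly once both finish.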

Why is it gaining traction?

Unlike cloud-dependent Pi providers, it delivers offline, self-managed Qwen inference with automatic model switching, port allocation, and lease-based server control, making it well suited to uninterrupted coding sessions. Users like the zero-config setup for 2/4/8-bit quantizations, reasoning support via the DeepSeek format, and a watchdog that ensures clean shutdowns.
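Automatic model switching can be pictured as a small state machine that stops the currently running server before starting one for the newly requested model, so only one model occupies memory at a time. This is an illustrative sketch; the names are hypothetical, not pi-llamacpp's actual code:

```typescript
// Hypothetical model switcher: ensure the requested model is being served,
// stopping any previously active server first. The log records the
// stop/start sequence for inspection.

class ModelManager {
  private active: string | null = null;
  readonly log: string[] = [];

  ensure(model: string): void {
    if (this.active === model) return; // already serving this model
    if (this.active !== null) {
      this.log.push(`stop ${this.active}`);
    }
    this.log.push(`start ${model}`);
    this.active = model;
  }
}
```

Requesting the same model twice is a no-op; requesting a different one tears down the old server before the new one comes up.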

Who should use this?

Pi coding agent users who want local AI tooling, especially those tuning agent workflows around Qwen's MoE efficiency for code generation and debugging. It's a good fit for backend developers and AI tinkerers looking to avoid API costs and latency in experimental setups.

Verdict

Grab it if you're deep in Pi and want local Qwen power: the docs are crisp and the commands intuitive. But with 79 stars, it's raw experimental tech; test thoroughly before relying on it in production.
