flightlesstux

Automatic prompt caching for Claude Code. Cuts token costs by up to 90% on repeated file reads, bug fix sessions, and long coding conversations - zero config.

100% credibility
Found Mar 13, 2026 at 23 stars.
AI Analysis
TypeScript
AI Summary

A plugin for AI coding tools that automatically optimizes and tracks reuse of stable prompt content in Anthropic API calls to reduce costs.

How It Works

1
🔍 Discover the savings tool

While building an app that chats with an AI assistant, you learn about a helpful plugin that reuses repeated instructions to cut costs.

2
📥 Add it to your coding helper

In your favorite AI coding tool, such as Claude Code or Cursor, paste two lines to install the plugin; no extra setup needed.

3
⚙️ Tweak settings if you want

Optionally add a small config file to your project folder to fine-tune how much content gets cached; the defaults work well out of the box.

4
Smartly prepare your chats

Pass your conversation setup to the tool, and it automatically marks stable parts, such as system instructions or tool definitions, for caching.
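Conceptually, that marking step can be sketched in TypeScript. The `markStable` helper and the simplified block shape below are illustrative, not the plugin's actual API; the `cache_control: { type: "ephemeral" }` marker itself is Anthropic's real prompt-caching annotation:

```typescript
// Simplified shape of an Anthropic content block, for illustration only.
type ContentBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

// Hypothetical helper: tag the last stable block so everything up to and
// including it can be served from cache on subsequent API calls.
function markStable(blocks: ContentBlock[], stableCount: number): ContentBlock[] {
  return blocks.map((block, i) =>
    i === stableCount - 1
      ? { ...block, cache_control: { type: "ephemeral" } }
      : block
  );
}

const system = markStable(
  [
    { type: "text", text: "You are a coding assistant." }, // stable across turns
    { type: "text", text: "Today's date is 2026-03-13." }, // changes often
  ],
  1 // only the first block is stable
);
console.log(system); // first block carries the cache marker, second does not
```

Placing the marker after the stable prefix and before volatile content is what keeps cache hits consistent from turn to turn.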

5
📊 Preview and track results

Check a quick analysis of what's reusable, then see real-time stats on how much you're saving with each chat turn.

6
💰 Watch costs drop dramatically

As you run multiple turns, you see huge savings—like 80-90% less on repeated content—proven in live tests and benchmarks.

🎉 Build cheaper, faster AI apps

Your app now runs AI conversations much more affordably, with easy visibility into savings, letting you focus on creating.


AI-Generated Review

What is prompt-caching?

This TypeScript MCP plugin automatically injects cache_control breakpoints into Anthropic API prompts, slashing token costs by up to 90% on stable content like system prompts, tool definitions, and large user messages in long coding sessions. It solves the pain of resending full conversation history every turn in apps built with the Anthropic SDK, delivering tools like optimize_messages for breakpoint insertion, get_cache_stats for real-time savings tracking, and analyze_cacheability for dry-run previews. Zero-config setup via npm or Claude Code's plugin marketplace keeps it dead simple.

Why is it gaining traction?

Unlike manual cache placement, which risks misses and wasted 1.25x cache-write premiums, it uses smart heuristics for automatic prompt optimization, with live benchmarks showing 80-92% savings on bug fixes and refactors. Developers like the session stats dashboard for hit rates and costs, plus easy MCP integration with Cursor, Windsurf, Zed, or Continue.dev, going well beyond basic automatic prompt engineering scripts. The included live test script verifies cache hits against real API calls, building trust fast.
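The savings arithmetic behind those numbers is simple. Using Anthropic's published multipliers (cache writes cost 1.25x the base input price, cache reads 0.1x), a sketch of the per-session break-even, with the stable prefix normalized to a cost of 1.0 per uncached turn:

```typescript
// Fraction saved on a stable prompt prefix over a multi-turn session.
// Multipliers follow Anthropic's pricing: writes 1.25x, reads 0.1x base.
function savings(turns: number): number {
  const uncached = turns * 1.0;            // pay full price every turn
  const cached = 1.25 + (turns - 1) * 0.1; // one write, then cheap reads
  return 1 - cached / uncached;
}

console.log(savings(10)); // ≈ 0.78 for a 10-turn session
console.log(savings(50)); // ≈ 0.88, approaching the 0.90 asymptote
```

Caching pays for its write premium from the second turn onward, which is why long coding sessions see the headline 80-90% reductions.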

Who should use this?

Anthropic SDK builders crafting agents or apps with repeated file reads, tool-heavy workflows, or multi-turn chats. Ideal for AI tool devs in Cursor or Zed needing visibility into cache performance without prompt engineering guesswork, or Node.js backend teams optimizing Claude Sonnet/Opus calls. Skip if you're just using Claude Code standalone—its built-in caching already handles that.

Verdict

Solid for early adopters tackling Anthropic costs, with strong docs, CI tests, and a live proof script, but at 23 stars and 1.0% credibility it's immature—watch for marketplace approval. Grab it if automatic prompt optimization fits your stack; otherwise, monitor for wider adoption.


