19PINE-AI / ikp

Public

IKP: Incompressive Knowledge Probes

44 stars · 2 forks · 100% credibility · Python

Found Apr 30, 2026 at 44 stars.
AI Summary

A benchmark of 1,400 factual questions across seven obscurity tiers that estimates black-box LLM parameter counts via log-linear scaling of accuracy with model size.

How It Works

1. 🔍 Discover IKP: You hear about a clever way to guess the hidden size of AI models by testing which obscure facts they know.

2. 🔑 Connect your AI: Link a service like OpenRouter so the tool can chat with any AI model you want to test.

3. 🚀 Run the test: Pick a model and launch the quiz of 1,400 tricky questions graded by difficulty.

4. 📊 See the estimate: Watch as it reveals the model's size in billions of parameters, plus strengths across easy-to-rare facts.

5. 🔬 Explore details: Dive into the results or quiz specific questions to understand what the model really knows.
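The quiz-and-grade loop from the steps above might look like this in miniature. The probe schema, the substring-match grading, and the stub model are all assumptions for illustration, not the repo's actual format.

```python
from collections import defaultdict

# Hypothetical probe schema: each question carries an obscurity tier (1-7).
probes = [
    {"tier": 1, "question": "What is the capital of France?", "answer": "Paris"},
    {"tier": 4, "question": "Who wrote 'Gradus ad Parnassum'?", "answer": "Fux"},
]

def grade(ask_model, probes):
    """Run every probe through a model callable and tally per-tier accuracy."""
    hits, totals = defaultdict(int), defaultdict(int)
    for p in probes:
        reply = ask_model(p["question"])
        totals[p["tier"]] += 1
        if p["answer"].lower() in reply.lower():  # crude substring grading
            hits[p["tier"]] += 1
    return {tier: hits[tier] / totals[tier] for tier in totals}

# A stub "model" that only knows the easy fact scores 100% on tier 1, 0% on tier 4:
print(grade(lambda q: "Paris" if "France" in q else "I don't know", probes))
```

Per-tier accuracies like these are what the size estimate is fit against: larger models keep answering correctly deeper into the rare-knowledge tiers.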

Unlock AI insights

Now you can gauge the true knowledge capacity of any AI, even closed-source ones, with one simple test.

AI-Generated Review

What is ikp?

IKP is a Python toolkit that estimates black-box LLM parameter counts from accuracy on 1,400 factual probes spanning 7 obscurity tiers, from universal truths to extreme long-tail knowledge. It tackles the mystery of undisclosed model sizes (GPT-4, Claude) on a single API budget, using a log-linear scaling law (R² = 0.917) validated on 89 open models up to 1.6T parameters. Run `ikp_estimate.py` against any OpenAI-compatible endpoint for instant results, complete with per-tier breakdowns.
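As a sketch of what a single probe call to an OpenAI-compatible endpoint could involve, here is a hypothetical payload builder; the model id, system prompt, and field choices are placeholders, not `ikp_estimate.py`'s actual request code.

```python
import json

def build_probe_request(model, question):
    """Build a chat-completions payload for one factual probe (hypothetical shape)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "Answer with the fact only."},
            {"role": "user", "content": question},
        ],
        "temperature": 0,  # deterministic answers make grading more stable
    }

payload = build_probe_request("openai/gpt-4o", "What is the capital of Australia?")
print(json.dumps(payload, indent=2))
```

The same payload shape works against OpenRouter or a local vLLM server, which is what makes a single script portable across providers.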

Why is it gaining traction?

Unlike synthetic benchmarks, IKP's incompressive knowledge probes measure genuinely memorized capacity, piercing closed-source black boxes where FLOP estimates or MMLU scores fall short. The CLI is a highlight: a research mode queries facts or researchers against landmark results, and an eval mode re-scores probes with Gemini as judge. Makefile-driven paper reproduction and an interactive React site make it friendly for scaling-law experiments.
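The Gemini-as-judge re-scoring could, in spirit, reduce to a rubric prompt plus a verdict parser. The prompt wording and function names below are illustrative, not the repo's implementation.

```python
def build_judge_prompt(question, reference, model_answer):
    """Assemble a grading prompt for an LLM judge (hypothetical rubric)."""
    return (
        "You are grading a factual quiz answer.\n"
        f"Question: {question}\n"
        f"Reference answer: {reference}\n"
        f"Model answer: {model_answer}\n"
        "Reply with exactly CORRECT or INCORRECT."
    )

def parse_verdict(reply):
    """Map the judge's free-text reply onto a boolean grade."""
    return reply.strip().upper().startswith("CORRECT")

print(build_judge_prompt("Capital of Australia?", "Canberra", "It is Canberra."))
```

Constraining the judge to a one-word verdict keeps parsing trivial and avoids re-introducing fuzzy grading at the scoring stage.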

Who should use this?

ML engineers benchmarking rival models without access to their internals, researchers studying knowledge scaling laws, and teams sizing up local vLLM deployments. It fits naturally into Python evaluation pipelines.

Verdict

A solid niche tool for black-box sizing via knowledge probes, with strong calibration and an easy CLI. Worth a spin despite 44 stars and a 1.0% credibility score signaling prototype maturity; polish the docs and tests before production use, but it is great for research now.
