badlogic

Collect, review, and upload redacted pi session files to a Hugging Face dataset

81
4
100% credibility
Found Apr 07, 2026 at 81 stars.
TypeScript
AI Summary

Command-line tool for safely collecting, redacting secrets from, reviewing with AI, and incrementally publishing coding agent sessions to Hugging Face datasets.

How It Works

1. 🔍 Discover safe sharing
You hear about a helpful tool that lets you share your AI coding adventures from open-source projects publicly, while keeping private details hidden.

2. 📦 Set up sharing space
In your project folder, you create a private area to prepare and organize your coding sessions for safe sharing.

3. 📥 Gather recent sessions
You collect your latest coding sessions, and the tool automatically hides any personal secrets it finds.

4. 🤖 AI safety review
A smart helper checks each session to confirm it's focused on your project and free of sensitive info.

5. 👀 Inspect and select
You browse the approved sessions, search for anything iffy, and set aside ones you don't want to share.

6. 🚀 Publish collection
With one go, you send the safe sessions online to build or update your public dataset.

🎉 Sessions shared safely
Your cleaned coding traces are now public for others to explore and learn from, ready for more anytime.
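
The selection logic behind steps 4 to 6 can be pictured with a small sketch. The record shape and the `selectUploadable` function are hypothetical and made up for illustration; the tool's actual state format is not described here. The assumption is that a session is published only if it passed the AI review, was not set aside by hand, and was not already uploaded in an earlier run.

```typescript
// Hypothetical record shape, for illustration only.
interface SessionRecord {
  id: string;
  reviewPassed: boolean; // outcome of the AI safety review (step 4)
  rejected: boolean;     // set aside manually during inspection (step 5)
  uploaded: boolean;     // already published in an earlier run (step 6)
}

/** Return the ids of sessions that are still eligible for publishing. */
function selectUploadable(records: SessionRecord[]): string[] {
  return records
    .filter((r) => r.reviewPassed && !r.rejected && !r.uploaded)
    .map((r) => r.id);
}
```

Keeping the manual rejection as its own flag, separate from the review result, means a human decision is never overwritten when the automated review is rerun.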

AI-Generated Review

What is pi-share-hf?

pi-share-hf is a TypeScript CLI that collects pi coding agent session files from your OSS GitHub projects and prepares them for publication. It redacts exact secrets taken from env files or explicit lists, scans the result with TruffleHog, runs LLM reviews against project context such as README.md, and uploads the safe, redacted files to a Hugging Face dataset. This makes it possible to share real agent traces for analysis or training data without leaking credentials or off-topic content. Each workspace keeps its state between runs, so reruns only process what has changed.
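
To make "redacting exact secrets from env files" concrete, here is a minimal sketch of deterministic redaction. It is not pi-share-hf's implementation: the function names `parseEnvSecrets` and `redactExact`, the `[REDACTED]` placeholder, and the minimum secret length of 4 are all assumptions chosen for illustration.

```typescript
// Illustrative sketch only; names, placeholder, and minLength are assumptions.

/** Collect the values of KEY=VALUE lines from .env-style text. */
function parseEnvSecrets(envText: string): string[] {
  const secrets: string[] = [];
  for (const rawLine of envText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=VALUE line
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const first = value[0];
    if (value.length >= 2 && (first === '"' || first === "'") && value[value.length - 1] === first) {
      value = value.slice(1, -1);
    }
    if (value.length > 0) secrets.push(value);
  }
  return secrets;
}

/** Replace every exact occurrence of each secret with a placeholder. */
function redactExact(
  text: string,
  secrets: string[],
  placeholder = "[REDACTED]",
  minLength = 4, // very short values would match ordinary text, so they are ignored
): string {
  // Longest first, so a secret that contains a shorter one is replaced whole.
  const ordered = Array.from(new Set(secrets))
    .filter((s) => s.length >= minLength)
    .sort((a, b) => b.length - a.length);
  let out = text;
  for (const secret of ordered) {
    out = out.split(secret).join(placeholder); // literal match, no regex escaping needed
  }
  return out;
}

const secrets = parseEnvSecrets('API_KEY="abc12345"\n');
console.log(redactExact("curl -H 'Authorization: abc12345'", secrets));
// prints: curl -H 'Authorization: [REDACTED]'
```

Exact matching of this kind only catches secrets that are already known, which is presumably why the review describes a scanner such as TruffleHog as a second layer.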

Why is it gaining traction?

It stands out with a tight CLI workflow: init a workspace per repo, collect with deny patterns and secrets lists, grep or list the uploadable sessions, reject any by hand, then upload. That is less work than manual redaction or stitching together generic dataset tools. The main draw is layered safety: deterministic redaction, a TruffleHog backstop, and pi-powered LLM checks tuned to the OSS project's context. Incremental processing and Hugging Face-native uploads keep repeated collection runs cheap.
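
One common way to get incremental processing is to remember a fingerprint of each session's content and skip sessions whose fingerprint has not changed. The sketch below shows that idea only; it is an assumption about the general technique, not the tool's actual state handling. `WorkspaceState`, `fingerprint`, and `selectChanged` are hypothetical names, and the FNV-1a hash is used solely to keep the example dependency-free (a real tool would more likely use a cryptographic hash such as SHA-256).

```typescript
// Illustrative sketch only; names and the choice of hash are assumptions.

/** Maps a session id to the fingerprint of the content processed last time. */
type WorkspaceState = Record<string, string>;

/** Length plus 32-bit FNV-1a hash over UTF-16 code units, as a hex string. */
function fingerprint(content: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  const hex = ("0000000" + (hash >>> 0).toString(16)).slice(-8);
  return content.length + ":" + hex;
}

/** Return the ids whose content is new or changed, plus the updated state. */
function selectChanged(
  sessions: Record<string, string>, // session id -> current file content
  state: WorkspaceState,
): { changed: string[]; nextState: WorkspaceState } {
  const changed: string[] = [];
  const nextState: WorkspaceState = { ...state };
  for (const id of Object.keys(sessions)) {
    const fp = fingerprint(sessions[id]);
    if (state[id] !== fp) {
      changed.push(id);
      nextState[id] = fp;
    }
  }
  return { changed, nextState };
}
```

On a first run every session counts as changed; passing `nextState` back in on the next run leaves only new or edited sessions to redact and review again.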

Who should use this?

OSS maintainers who use pi.dev on GitHub repos and want to collect and publish agent session datasets on Hugging Face for benchmarking or fine-tuning. It also suits teams reviewing coding agent performance across projects, for example sessions from client-facing tools or Shopify apps, where traces must be shared without privacy risks. Skip it if you don't run pi sessions or don't plan to share traces.

Verdict

Grab it if pi.dev is in your stack. At 81 stars and 100% credibility it is still a niche tool, but thorough docs and a focused CLI make it production-ready for dataset collection today. Test on a small repo first; it's lightweight but needs TruffleHog and a Hugging Face token set up.
