uncase-ai / UNCASE

Public

Open-source framework for turning expert knowledge into PII-free synthetic conversational data and production-ready LoRA adapters.

uncase.md compliance dataset-generation fastapi fine-tuning gdpr

100% credibility

Found Feb 27, 2026 at 47 stars -- GitGems finds repos before they trend. Get early access to the next one.

AI Analysis

Python

AI Summary

UNCASE is an open-source framework that generates privacy-safe synthetic conversational data for fine-tuning language models in regulated industries like healthcare and finance.

How It Works

💡 Discover safe AI training

You need custom chat AI for healthcare or finance but can't risk real customer data due to privacy rules.

🔍 Find UNCASE

This free tool creates realistic fake conversations to train AI without exposing sensitive info.

🏠 Set up your workspace

Create a free account and connect an AI thinking service so everything works smoothly.

📋 Design conversation blueprints

Outline simple chat patterns like customer questions and advisor replies for your industry.

📚 Add real-world facts

Upload guides and manuals so generated chats use accurate industry details.

✨ Generate practice conversations

Click generate to create hundreds of lifelike synthetic chats based on your blueprints.

✅ Check quality automatically

Review scores for realism, privacy safety, and variety to ensure perfect training data.

🚀 Export ready training data

Download certified datasets to fine-tune your custom AI safely and effectively.

Sign up to see the full architecture

6 more

Star Growth

See how this repo grew from 47 to 52 stars Sign Up Free

Repurpose This Repo

Repurpose is a Pro feature

Generate ready-to-use prompts for X threads, LinkedIn posts, blog posts, YouTube scripts, and more -- with full repo context baked in.

Unlock Repurpose

AI-Generated Review

What is UNCASE?

UNCASE is an open-source framework in Python that converts expert knowledge into PII-free synthetic conversational data for fine-tuning LoRAs and QLoRAs, targeting regulated industries like healthcare and finance. You feed it anonymized conversation blueprints called seeds—structured JSON outlining roles, steps, and domains—and it runs a full pipeline: scanning for privacy risks, generating dialogues via any LLM provider (Claude, GPT, Ollama), evaluating quality across six metrics, and exporting to 10 chat formats like ChatML or Alpaca. Spin it up with one Docker Compose command for API, dashboard, Postgres, and Redis.

Why is it gaining traction?

Unlike generic synthetic data tools that spit out low-fidelity noise, UNCASE enforces zero-PII policies with regex and NER scans, plus traceable seeds for reproducible, auditable outputs regulators can inspect. Developers love the batteries-included Docker profiles (GPU, MLflow, observability), 75+ API endpoints for scripting, and Next.js dashboard for monitoring jobs, costs, and evals—no glue code needed. As an LLM open-source framework on GitHub, it slots into github open source tools lists for ai agents framework open source workflows.

Who should use this?

ML engineers at banks, hospitals, or legal firms building domain-specific chat agents without risking PHI leaks. Teams fine-tuning open models on proprietary convos who need quick prototypes via CLI (`uncase template export`) or self-hosted pipelines. Avoid if you're doing raw RAG— this is for structured synthetic data gen.

Verdict

Worth forking for regulated fine-tuning pilots; solid Docker/docs (970 tests, 73% coverage) despite 53 stars and 1.0% credibility signaling early alpha. Production? Wait for more battle scars, but great open-source framework starter.

(198 words)

Sign up to read the full AI review Sign Up Free

Similar repos coming soon.

Stars

Forks

Followers

Base stars: 52 stars

Penalty: New account (5d): -70%

Penalty: New account with popular repo: -90%

Bonus: AI verified quality (100%)

Account age: 5 days

Repo age: 8 days

Updated: Mar 03, 2026