Mentat-Lab

Preclinical is an open-source healthcare AI safety testing platform. It simulates realistic adversarial patient interactions against your agent, stores transcripts, and grades outcomes against safety rubrics.

13
1
100% credibility
Found Mar 09, 2026 at 13 stars -- GitGems finds repos before they trend. Get early access to the next one.
Sign Up Free
AI Analysis
TypeScript
AI Summary

Preclinical is a self-hosted open-source platform that simulates adversarial patient conversations to test and grade healthcare AI agents on safety rubrics.

How It Works

1
🚀 Launch your AI safety playground

Download and start the testing tool on your computer with one simple command using the included setup guide.

2
🤖 Connect your healthcare AI agent

Add your AI assistant by linking it to services like chat APIs, voice platforms, or web chats – just enter a web address or login details.

3
📋 Choose or create test scenarios

Pick ready-made patient stories or generate new ones from medical guidelines to simulate real healthcare conversations.

4
⚔️ Run adversarial safety tests

Start multi-turn chats where simulated patients challenge your AI, testing how it handles tricky medical situations.

5
📱 Watch live conversations unfold

See real-time transcripts as the patient probes your AI, with updates streaming right to your screen.

6
Get automatic grading and scores

Review detailed results showing pass/fail on safety criteria, with evidence from the conversation.

📊 Track improvements over time

Use the dashboard to monitor test history, pass rates, and refine your AI for safer healthcare interactions.

Sign up to see the full architecture

5 more

Sign Up Free

Star Growth

See how this repo grew from 13 to 13 stars Sign Up Free
Repurpose This Repo

Repurpose is a Pro feature

Generate ready-to-use prompts for X threads, LinkedIn posts, blog posts, YouTube scripts, and more -- with full repo context baked in.

Unlock Repurpose
AI-Generated Review

What is preclinical?

Preclinical is an open-source TypeScript platform for healthcare AI safety testing, simulating adversarial patient interactions against your agent in multi-turn conversations. It captures transcripts, runs them through safety rubrics grading triage accuracy, harmful advice avoidance, and hallucinations, then stores everything in a self-hosted PostgreSQL backend with a React UI. Docker Compose spins it up locally in minutes, supporting OpenAI, Vapi, LiveKit, Pipecat, or browser-based agents.

Why is it gaining traction?

Unlike generic LLM evals, it focuses on realistic preclinical studies with patient personas attacking weak spots in healthcare agents, delivering reproducible outcomes and rubric-based scores. Provider-agnostic integration means you test production setups without mocks, and the UI tracks runs live via server-sent events. Self-hosting keeps sensitive transcripts in-house during preclinical development.

Who should use this?

Healthcare AI builders in preclinical research or the preclinical phase, testing symptom triage bots, voice assistants, or chat agents before patient exposure. Ideal for teams rethinking agent safety via adversarial interactions, like those prepping for preclinical rethinking webinars or running preclinical jobs on OpenAI/Vapi stacks.

Verdict

Solid for early preclinical science despite 12 stars and 1.0% credibility—docs, Docker setup, and E2E tests are pro-level. Try it if you're evaluating agents; skip if needing enterprise scale yet.

(198 words)

Sign up to read the full AI review Sign Up Free

Similar repos coming soon.