lechmazur

LLM Persuasion Benchmark tests whether one language model can change another model’s stated position over the course of a multi-turn conversation. It runs round-robin persuasion dialogues on contested propositions and measures both persuasive effectiveness and target resistance from stance shifts recorded before and after each exchange.
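Abstractly, one round of the protocol described above can be sketched as follows. This is a minimal illustration, not the repo's actual code: the `persuader`/`target` callables and the probe wording are hypothetical stand-ins for real model clients.

```python
def run_dialogue(persuader, target, proposition, side, turns=8):
    """Alternate persuader and target messages for `turns` exchanges,
    probing the target's stance (1-7 Likert) before and after.
    `persuader`/`target` are callables prompt -> reply (hypothetical
    stand-ins for real model API clients)."""
    probe = (f"On a 1-7 scale, how strongly do you agree that "
             f"{proposition}? Reply with a single digit.")
    pre = int(target(probe))                 # stance before the debate
    transcript = []
    msg = f"Argue the {side} side of: {proposition}"
    for _ in range(turns):
        msg = persuader(msg)                 # persuader's argument
        transcript.append(("persuader", msg))
        msg = target(msg)                    # target's reply
        transcript.append(("target", msg))
    post = int(target(probe))                # stance after the debate
    return pre, post, transcript

# Deterministic stubs in place of real model calls:
stances = iter(["4", "6"])                   # pre-probe, post-probe answers
def stub_target(prompt):
    return next(stances) if "1-7 scale" in prompt else "counterargument"

pre, post, transcript = run_dialogue(
    lambda m: "argument", stub_target,
    "cars should be banned from the city center", "PRO", turns=2)
print(pre, post)   # 4 6
```

The stance shift (here, 4 to 6) is what feeds the leaderboards described below.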

AI Summary

This repository hosts a benchmark evaluating large language models' persuasion effectiveness in multi-turn conversations, complete with leaderboards, charts, model profiles, and debate transcripts.

How It Works

1. 🔍 Discover the Benchmark

You find this GitHub page sharing rankings of AI models' abilities to persuade each other in debates.

2. 📊 View Leaderboards

You check the top lists to see which AIs are best at changing minds and which resist the most.

3. Spot Top Performers

Standout models like GPT and Claude shine as powerful persuaders in the colorful charts.

4. 📈 Explore Charts and Matches

You look at graphs showing head-to-head battles and overall offense versus defense strengths.

5. 📖 Read Model Stories

You dive into profiles explaining how each AI argues or defends in real conversations.

6. 💬 Enjoy Debate Examples

You read fun transcripts of AIs debating topics like city-center car bans or animal reintroduction.

🧠 Master AI Persuasion

You now understand which AIs win arguments and why, ready to pick the best for tough talks.


AI-Generated Review

What is persuasion?

The repo hosts an LLM persuasion benchmark pitting one language model against another in multi-turn debates on contested topics such as embryo screening or city-center car bans. It measures stance shifts on a seven-point scale via pre- and post-conversation probes, producing leaderboards for persuasion strength, target susceptibility, and pairwise matchups. Developers get charts, model dossiers, transcripts, and quotes revealing real multi-turn persuasion dynamics, not just one-shot fluency.
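The scoring described above can be sketched as follows. The function names and record layout are hypothetical, not taken from the repo: persuasion strength is the stance shift signed toward the persuader's assigned side, and susceptibility is how far a target moves regardless of direction.

```python
from collections import defaultdict

def stance_shift(pre: int, post: int, persuader_side: str) -> int:
    """Shift on the 7-point scale, signed toward the persuader's side:
    a PRO persuader wants the score to rise; a CON persuader wants it to fall."""
    delta = post - pre
    return delta if persuader_side == "PRO" else -delta

def summarize(records):
    """Aggregate per-model offense (persuasion) and defense (susceptibility).
    Each record: (persuader, target, pre, post, persuader_side)."""
    offense, defense = defaultdict(list), defaultdict(list)
    for persuader, target, pre, post, side in records:
        offense[persuader].append(stance_shift(pre, post, side))
        defense[target].append(abs(post - pre))  # smaller = more resistant
    mean = lambda xs: sum(xs) / len(xs)
    return ({m: mean(v) for m, v in offense.items()},
            {m: mean(v) for m, v in defense.items()})

# Illustrative records only; model names are placeholders.
records = [
    ("gpt", "grok", 4, 6, "PRO"),    # moved the target 2 points toward PRO
    ("claude", "grok", 4, 4, "CON"), # no movement: strong defense
]
persuasion, susceptibility = summarize(records)
print(persuasion["gpt"])        # 2.0
print(susceptibility["grok"])   # 1.0
```

Ranking models by the first dictionary gives a persuasion leaderboard; ranking by the second (ascending) gives a resistance leaderboard.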

Why is it gaining traction?

Unlike basic preference evals, it enforces 8-turn round-robin dialogues across the PRO and CON sides of 15 topics, exposing offense-defense splits, such as GPT variants dominating persuasion while Grok resists shifts. The pairwise matrices and topic asymmetries offer a fresh meta-analysis of persuasion effectiveness, useful for teams tracking LLM persuasion research and running quick model comparisons.
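A round-robin schedule of the kind described can be sketched as below. Model and topic names are placeholders; the match-record fields are assumptions for illustration.

```python
from itertools import permutations

def schedule(models, topics, turns=8):
    """Every ordered (persuader, target) pair debates every topic
    from both the PRO and the CON side."""
    matches = []
    for persuader, target in permutations(models, 2):  # ordered pairs
        for topic in topics:
            for side in ("PRO", "CON"):
                matches.append({"persuader": persuader, "target": target,
                                "topic": topic, "side": side, "turns": turns})
    return matches

models = ["model_a", "model_b", "model_c"]   # placeholders
topics = [f"topic_{i}" for i in range(15)]   # 15 contested propositions
matches = schedule(models, topics)
print(len(matches))  # 6 ordered pairs * 15 topics * 2 sides = 180
```

Ordered pairs matter because each model plays both roles: its offense and defense scores come from different matches.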

Who should use this?

LLM evals engineers benchmarking debate capabilities for alignment research. AI product leads scouting persuadable models for chat agents or copilots. Researchers replicating ChangeMyView-style persuasion study setups in their own evaluation pipelines.

Verdict

With only 13 stars, it's early-stage: the docs shine via detailed leaderboards and artifacts, but the repo lacks runnable code or tests. Grab it for quick persuasion-benchmark insights if you're deep in LLM evals; skip it if you need production-ready tooling.


