Tristan0318

[TBD] Official Repository for "FraudBench: A Multimodal Benchmark for Detecting AI-Generated Fraudulent Refund Evidence"

17 stars · 0 forks · 100% credibility
Found May 14, 2026 at 17 stars.
AI Analysis
Python
AI Summary

FraudBench provides a dataset of real and AI-generated damaged product images along with tools to evaluate how well AI models and humans detect fraudulent refund evidence.

How It Works

1
📚 Discover FraudBench

You stumble upon this research project through a paper or online demo, curious about how AI spots fake damage photos in refund scams.

2
🖼️ Download image collection

Grab the ready-made set of real damaged-item photos and AI-made fakes, spanning shopping categories such as beauty and electronics.

3
🔗 Link up AI helpers

Connect the supported AI services with API keys so they can examine the images for you; no local GPUs needed.
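
Because the models are reached through hosted APIs, "connecting" them mostly means supplying keys. A minimal sketch of that pattern, with hypothetical provider and environment-variable names (the repo's real configuration may differ):

```python
import os

# Hypothetical provider -> env-var mapping; the actual names used by
# the repo's scripts may differ. The point is only the pattern:
# models run behind hosted APIs, so setup is just exporting keys.
PROVIDERS = {
    "gpt": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "qwen": "DASHSCOPE_API_KEY",
}

def configured_models(env=None):
    """Return the providers whose API key is set in the environment."""
    env = os.environ if env is None else env
    return {name for name, var in PROVIDERS.items() if env.get(var)}
```

Running `configured_models()` before an eval lets a script skip models whose keys are missing instead of failing mid-run.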

4
🚀 Launch detection tests

Kick off evaluation runs to see how each model performs on single images or image groups, with or without the accompanying shopper reviews.
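
The single/multi-image and with/without-review conditions can be sketched as one loop. The `classify` stub and the record fields here are hypothetical stand-ins for a real MLLM API call and the repo's actual JSON schema:

```python
import json

def classify(images, review_text=None):
    # Stand-in for a real MLLM API call. A real run would upload the
    # image(s), optionally with the shopper review, and parse the reply.
    return {"label": "fake", "confidence": 0.5}

def run_condition(samples, multi_image=False, with_review=False):
    """Evaluate every sample under one benchmark condition."""
    records = []
    for s in samples:
        imgs = s["images"] if multi_image else s["images"][:1]
        review = s["review"] if with_review else None
        pred = classify(imgs, review)
        records.append({"id": s["id"], "gold": s["gold"], **pred})
    return records

demo = [{"id": 1, "images": ["a.jpg", "b.jpg"],
         "review": "arrived broken", "gold": "fake"}]
print(json.dumps(run_condition(demo, multi_image=True, with_review=True)))
```

Each of the four flag combinations yields one result file, which is what makes the later cross-condition comparisons possible.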

5
Explore extras
🧪
Deep dive studies

Run focused ablations that tweak prompts or swap reviews between products to uncover model weaknesses.
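
A prompt-sensitivity ablation boils down to re-running the same samples under each prompt variant. The variants below are invented for illustration; the benchmark's actual prompts differ:

```python
# Hypothetical prompt variants for a prompt-sensitivity ablation.
PROMPTS = {
    "direct": "Is this damage photo real or AI-generated?",
    "forensic": "Inspect lighting, texture, and edge artifacts. "
                "Real or AI-generated?",
}

def ablate(samples, evaluate):
    """Run every prompt variant over the same samples, keyed by variant."""
    return {name: [evaluate(prompt, s) for s in samples]
            for name, prompt in PROMPTS.items()}

# Dummy evaluator so the sketch runs without any API:
out = ablate(["img1.jpg"], lambda prompt, img: {"prompt_len": len(prompt)})
```

Keeping the sample set fixed across variants is what isolates the prompt as the only changing factor.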

👀
Human check app

Open a simple web page to rate images as real or fake, like a fun quiz.

6
📈 Crunch the numbers

Generate charts of hit rates, confidence scores, and comparisons across test conditions.
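
The two headline numbers per condition (hit rate and mean confidence) reduce to a short aggregation over the result records. Field names follow the hypothetical record layout sketched earlier, not necessarily the repo's schema:

```python
def summarize(records):
    """Hit rate and mean self-reported confidence over result records."""
    hits = sum(r["label"] == r["gold"] for r in records)
    mean_conf = sum(r["confidence"] for r in records) / len(records)
    return {"accuracy": hits / len(records), "mean_confidence": mean_conf}

demo = [
    {"label": "fake", "gold": "fake", "confidence": 0.9},
    {"label": "real", "gold": "fake", "confidence": 0.6},
]
print(summarize(demo))  # accuracy 0.5, mean confidence 0.75
```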

🎉 Unlock insights

Celebrate with clear reports on how well AIs catch fraud, ready to share or build better safeguards.


AI-Generated Review

What is FraudBench?

FraudBench is a Python-based multimodal benchmark for detecting AI-generated fraudulent refund evidence in e-commerce. It pairs 822 real-world review samples with 7,928 images across 29 categories such as electronics and beauty, where fake "damaged goods" photos are synthesized by six state-of-the-art image-editing models. Users run API-driven evals on 11 MLLMs (Qwen, Grok, Gemini, and GPT variants) and on specialized detectors, plus a blind human-eval web app, producing JSON results and Excel metrics across single- and multi-image conditions, with or without review text.
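
As a quick sanity check on the composition above (822 review samples, 7,928 images, 29 categories), the per-sample and per-category averages work out to roughly:

```python
samples, images, categories = 822, 7928, 29
print(round(images / samples, 1))    # about 9.6 images per review sample
print(round(samples / categories))   # about 28 samples per category
```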

Why is it gaining traction?

Unlike generic AI-detection benchmarks, FraudBench simulates realistic refund scams with paired text-image fraud, testing MLLMs under transaction pressures like multi-step image delivery. The official repository streamlines runs via simple bash scripts (no local GPUs needed, just API keys), making reproducible ablations on prompt sensitivity or review mismatches straightforward. As AI fraud rises on platforms like Amazon, developers grab it for quick baselines.

Who should use this?

AI safety researchers benchmarking MLLMs on multimodal fraud detection. E-commerce teams building refund safeguards against AI-edited "damaged item" claims. ML engineers evaluating VLMs in noisy, real-world settings like food delivery or travel disputes.

Verdict

Grab it if you're in AI fraud detection: a solid academic benchmark with easy Python scripts and an HF dataset, though 17 stars and a 1.0% credibility score signal an early-stage project. Docs are thorough, but expect research-oriented tweaks over production-ready tooling.

