Josh-blythe

Open-source cross-modal and multimodal prompt injection test suite. 38,000+ attack payloads across text, image, document, and audio modalities. Research-backed by OWASP LLM Top 10, CrossInject (ACM MM 2025), FigStep (AAAI 2025), DolphinAttack, and CSA 2026.

12
1
100% credibility
Found Apr 10, 2026 at 12 stars -- GitGems finds repos before they trend. Get early access to the next one.
Sign Up Free
AI Analysis
Python
AI Summary

A comprehensive labeled dataset of multimodal prompt injection attacks and benign examples, sourced from academic research, for training detectors to protect AI systems.

How It Works

1
🔍 Discover Bordair Dataset

You hear about a helpful collection of examples that teach AI to spot sneaky tricks hidden in pictures, sounds, or documents.

2
đź“– Read the Guide

You look through the friendly explanation of bad tricks and normal everyday messages, all backed by trusted research.

3
🛠️ Gather Your Examples

With a few easy steps, you create thousands of practice examples of tricks and safe messages to train with.

4
🔄 Balance Good and Bad

You mix equal parts of tricky attacks and harmless chats so your AI learns to tell them apart fairly.

5
đź§  Teach Your AI Guard

You feed the examples into your learning tool, watching it get smarter at catching hidden dangers.

âś… AI Stays Safe

Your AI now confidently spots and blocks tricky inputs across text, images, sounds, and files, keeping conversations secure.

Sign up to see the full architecture

4 more

Sign Up Free

Star Growth

See how this repo grew from 12 to 12 stars Sign Up Free
Repurpose This Repo

Repurpose is a Pro feature

Generate ready-to-use prompts for X threads, LinkedIn posts, blog posts, YouTube scripts, and more -- with full repo context baked in.

Unlock Repurpose
AI-Generated Review

What is bordair-multimodal-v1?

Bordair-multimodal-v1 is a Python-based open source GitHub tool that generates 61,875 labeled samples—38,000+ attacks plus benign counterparts—across text, image, document, and audio modalities for training prompt injection detectors in multimodal LLMs. It tackles cross-modal attacks like split payloads in images or audio that bypass text-only defenses, backed by OWASP LLM Top 10, CrossInject (ACM MM 2025), and FigStep (AAAI 2025). Run simple generators to output JSON datasets ready for binary classifiers, complete with source attributions.

Why is it gaining traction?

It packs research-grade attacks from 2025/2026 papers into a one-stop dataset, covering everything from OCR image injections to ultrasonic audio exploits and GCG suffixes—far beyond basic text lists. As an open source GitHub alternative to proprietary red-teaming suites, it lets you regenerate fresh payloads on-demand without API keys or costs. Developers dig the 1:1 attack-benign balance and Hugging Face integration for quick training starts.

Who should use this?

AI security researchers benchmarking detectors against cross-modal threats. LLM app devs testing vision-language models for exfiltration via docs or audio. Red-team leads evaluating 2025-era multimodal systems pre-launch.

Verdict

Worth forking for AI security workflows—thorough README makes it plug-and-play despite 12 stars and 1.0% credibility score signaling early maturity. Validate generated payloads yourself; it's raw research data, not battle-tested prod tooling.

(198 words)

Sign up to read the full AI review Sign Up Free

Similar repos coming soon.