Yuanhong-Zheng

Official code for PEARL: Personalized Streaming Video Understanding Model

AI Summary

PEARL is an evaluation framework for testing AI models on personalized video understanding, using scene clips and custom concepts to answer questions about streaming videos.

How It Works

1. 📹 Discover PEARL

You find a helpful tool that makes AI understand your personal videos, like spotting family moments or favorite actions.

2. 📥 Gather your videos

Download sample home videos and question lists to see how well it recognizes special people or activities in them.

3. ✂️ Break into clips

The tool automatically chops long videos into short, easy-to-watch scenes so the AI can focus better; see the splitting sketch after this list.

4. 🤖 Wake the AI helpers

Start the smart assistants that learn your custom terms, like "Grandma cooking", by looking at video moments.

5. ❓ Ask about your videos

Pose fun questions like "When did {dog} play fetch?" and watch the AI find exact moments.

6. 🎉 Unlock video magic

Celebrate as you get clear answers and scores showing how well it knows your personal video world.
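
PEARL's own splitting code isn't shown on this page, but step 3 is standard shot detection. A minimal sketch, using the PySceneDetect library as a stand-in; the filename and threshold are placeholders, not the repo's actual script:

```python
# Minimal scene-splitting sketch. PEARL's actual preprocessing script is not
# shown on this page; this approximates the "break into clips" step with
# PySceneDetect (pip install "scenedetect[opencv]"; ffmpeg must be on PATH).
from scenedetect import ContentDetector, detect, split_video_ffmpeg

VIDEO = "home_video.mp4"  # placeholder input file

# Find cut points by scoring frame-to-frame content change.
scenes = detect(VIDEO, ContentDetector(threshold=27.0))
for start, end in scenes:
    print(f"clip: {start.get_timecode()} -> {end.get_timecode()}")

# Write one short clip per detected scene next to the input video.
split_video_ffmpeg(VIDEO, scenes)
```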

AI-Generated Review

What is PEARL?

PEARL equips vision-language models for personalized streaming video understanding: define custom concepts like "my coworker in blue," and it localizes them by timestamp while fielding multi-turn queries on live feeds. Python scripts handle scene splitting, concept memory, clip retrieval, and multi-GPU inference via Qwen3-VL servers. Users get eval metrics on PEARL-Bench, no model fine-tuning needed.
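
Given that description, a query presumably amounts to a chat completion against a locally served Qwen3-VL model. A hedged sketch, assuming an OpenAI-compatible endpoint such as vLLM exposes; the port, model name, concept-injection prompt, and `video_url` content part are all illustrative guesses, not PEARL's documented interface:

```python
# Hypothetical query flow. The review says inference runs through local
# Qwen3-VL servers; assuming they expose an OpenAI-compatible endpoint (as
# vLLM does, e.g. launched via `vllm serve Qwen/Qwen3-VL-8B-Instruct
# --port 8000`, itself a guess), a personalized question could look like this.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

messages = [
    # Define the custom concept first, so "{dog}" is grounded for the model.
    {"role": "system",
     "content": "Concept {dog}: a golden retriever wearing a red collar."},
    # Then ask about one retrieved scene clip.
    {"role": "user",
     "content": [
         {"type": "video_url",
          "video_url": {"url": "file:///clips/scene_003.mp4"}},
         {"type": "text", "text": "When did {dog} play fetch in this clip?"},
     ]},
]

reply = client.chat.completions.create(model="Qwen/Qwen3-VL-8B-Instruct",
                                       messages=messages)
print(reply.choices[0].message.content)
```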

Why is it gaining traction?

Zero-shot personalization on endless streams sets it apart from static video QA tools: it retrieves relevant clips via embeddings, boosting recall without retraining. Devs like the plug-and-play inference servers and the bash scripts for 8-GPU evals, and it outshines generic VLMs on dynamic, user-defined concepts.
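
That retrieval claim reduces to nearest-neighbor search over clip vectors. A minimal sketch, assuming cosine similarity over per-clip embeddings from some vision encoder; random arrays stand in for real data:

```python
# Sketch of embedding-based clip retrieval: rank stored scene clips by cosine
# similarity to a query embedding. The encoder and memory layout are
# assumptions; PEARL's real concept memory may differ.
import numpy as np

def rank_clips(query: np.ndarray, clips: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k clips most similar to the query vector."""
    query = query / np.linalg.norm(query)
    clips = clips / np.linalg.norm(clips, axis=1, keepdims=True)
    return np.argsort(-(clips @ query))[:k]

clip_embeddings = np.random.rand(100, 512).astype(np.float32)  # one row per clip
query_embedding = np.random.rand(512).astype(np.float32)       # e.g. "{dog} playing fetch"

print("most relevant clips:", rank_clips(query_embedding, clip_embeddings).tolist())
```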

Who should use this?

ML researchers benchmarking personalized video understanding beyond standard video QA. Devs building surveillance apps that need to "spot {specific person} at time T". Teams adapting Qwen VLMs for content moderation or personalized recommendations who want to skip the serving and eval boilerplate.

Verdict

Solid arXiv-backed prototype (40 stars) for personalized streaming video understanding experiments, but a 1.0% credibility score and a still-pending video-level eval flag immaturity; the docs are clear and the scripts battle-tested. Fork it if streaming video hooks you, and keep an eye on the repo's releases given the low traction.
