msu-video-group

CrowdSAL: Crowdsourced Video Saliency Prediction Dataset and Benchmark

Found Apr 12, 2026 at 11 stars
AI Summary

CrowdSAL provides the largest dataset of videos with crowd-sourced eye fixation data for training and testing attention prediction models, along with a benchmark tool to measure prediction accuracy.

How It Works

1. 🔍 Discover CrowdSAL

You hear about CrowdSAL, the biggest collection of real videos with maps showing exactly where thousands of people's eyes naturally focus while watching.

2. 📥 Download the collection

Grab the videos, eye-focus maps, tracking details, and info files from the provided links on Google Drive or Hugging Face.

3. 🖥️ Prepare your predictions

Put your own predicted eye-focus maps into a folder, with filenames matching the collection's video names.

4. 🚀 Run the checker

Start the simple evaluation tool that compares your eye predictions against the real crowd eye data from thousands of viewers.

5. ⏳ Watch it process

Sit back as it breaks down every frame of every video, calculating how well your predictions match real attention.

6. 📊 Get your scores

Receive clear scores like similarity and accuracy ratings, so you know exactly how good your eye-tracking guesses are!
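The evaluation loop the steps above describe can be sketched in a few lines. This is a minimal illustration, not CrowdSAL's actual benchmark code: the folder layout, function names, and the choice of SIM (histogram intersection) as the example metric are all assumptions.

```python
# Hypothetical sketch of a per-video evaluation loop: compare each
# predicted saliency frame to its ground-truth map and average the
# scores. Not CrowdSAL's real API -- names and layout are assumed.
import numpy as np

def sim(pred: np.ndarray, gt: np.ndarray) -> float:
    """Histogram intersection (SIM) between two saliency maps."""
    pred = pred / (pred.sum() + 1e-8)  # normalize to a probability map
    gt = gt / (gt.sum() + 1e-8)
    return float(np.minimum(pred, gt).sum())

def score_video(pred_frames, gt_frames) -> float:
    """Average SIM over all frames of one video."""
    return float(np.mean([sim(p, g) for p, g in zip(pred_frames, gt_frames)]))
```

A perfect prediction scores 1.0 on SIM; disjoint maps score near 0.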

AI-Generated Review

What is CrowdSAL?

CrowdSAL delivers the largest crowdsourced video saliency dataset for training and benchmarking prediction models, packing 5000 FullHD videos from YouTube and Vimeo with audio tracks and mouse fixations from over 19,000 observers across 2.7 million frames. Developers grab the dataset via Hugging Face or Google Drive, then use its Python benchmark tool to evaluate saliency maps against ground truth using metrics like NSS, CC, similarity, and AUC-Judd. It solves the scarcity of diverse, high-quality video saliency data for attention modeling.

Why is it gaining traction?

Unlike smaller saliency benchmarks, CrowdSAL stands out with its scale, per-frame fixations, audio integration, and CC-BY license, making it plug-and-play for crowdsourced video prediction experiments. The CLI-driven eval extracts frames via FFmpeg and parallelizes scoring, spitting out JSON results fast on test splits. Python deps are minimal, hooking CV researchers tired of toy datasets.
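The pipeline described above (frame extraction via FFmpeg, parallel scoring, JSON output) could plausibly be structured like this. The paths, the placeholder scorer, and the JSON shape are assumptions for illustration; only the FFmpeg invocation pattern is standard.

```python
# Plausible shape of an FFmpeg + parallel-scoring pipeline.
# score_video is a stub; a real scorer would compare extracted frames
# against ground-truth maps. Not the benchmark's actual interface.
import json
import subprocess
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

def extract_cmd(video: Path, out_dir: Path) -> list[str]:
    """FFmpeg invocation that dumps every frame as a numbered PNG."""
    return ["ffmpeg", "-i", str(video), str(out_dir / "frame_%05d.png")]

def score_video(video: Path) -> tuple[str, float]:
    """Placeholder per-video scorer; returns (video name, score)."""
    return video.stem, 0.0

def run_benchmark(videos: list[Path], out_json: Path) -> dict:
    """Score videos in parallel and dump the results as JSON."""
    with ProcessPoolExecutor() as pool:
        results = dict(pool.map(score_video, videos))
    out_json.write_text(json.dumps(results, indent=2))
    return results
```

Each video is independent, so process-level parallelism scales cleanly across the test split before the per-video scores are aggregated into one JSON file.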

Who should use this?

Computer vision engineers benchmarking video saliency models for AR/VR gaze prediction or ad optimization. ML teams fine-tuning transformers on real observer data from shorts and long-form content. Anyone crowdsourcing attention maps without building fixation pipelines from scratch.

Verdict

Grab it if video saliency is your jam: a solid dataset and benchmark, though 11 stars and basic docs signal early maturity. Skip it for production use unless you plan to contribute and help it grow.


