
tue-mps / pmt

Public

[CVPR 2026 Workshop] Official code and models for Plain Mask Transformer (PMT).

18 stars · Jupyter Notebook · found Mar 30, 2026

AI Summary

PMT is a research tool for training models to segment objects in images and videos while keeping the core image encoder frozen, so that encoder can be reused across tasks.

How It Works

1
📖 Discover PMT

You find this clever tool for outlining objects in photos and videos without messing with the main image analyzer.

2
🛠️ Set up your workspace

You prepare your computer by installing the needed software so everything runs smoothly.

3
🖼️ Gather your images

You collect pictures along with labels showing where objects like people or cars are located.
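These labels typically follow the COCO format, which the repo targets. A minimal, purely illustrative annotation record (field names follow the public COCO spec, not this repo's internal loaders) looks like:

```python
# A minimal COCO-style annotation record (illustrative only; field names
# follow the public COCO format, not this repo's internal loaders).
annotation = {
    "image_id": 42,
    "category_id": 1,          # e.g. "person"
    "bbox": [10, 20, 50, 80],  # x, y, width, height
    "segmentation": [[10, 20, 60, 20, 60, 100, 10, 100]],  # polygon
}

def bbox_area(ann):
    """Area of the bounding box in pixels."""
    _x, _y, w, h = ann["bbox"]
    return w * h

print(bbox_area(annotation))  # 4000
```

Each image usually carries many such records, one per labeled object.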

4
🚀 Start teaching it

You choose a ready recipe and launch the learning process, watching your model get smarter with each batch of images.
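The "main image analyzer stays untouched" idea can be sketched in plain PyTorch. This is a toy stand-in with tiny linear layers, not the repo's actual model classes: the encoder's weights are frozen, and only the decoder receives gradient updates.

```python
import torch
import torch.nn as nn

# Toy stand-ins: a frozen "foundation model" encoder and a small decoder.
encoder = nn.Linear(16, 8)   # pretend vision backbone
decoder = nn.Linear(8, 4)    # pretend mask decoder

# Freeze the encoder so it stays reusable across tasks.
for p in encoder.parameters():
    p.requires_grad = False

opt = torch.optim.AdamW(decoder.parameters(), lr=1e-4)

x = torch.randn(2, 16)
target = torch.randn(2, 4)
loss = nn.functional.mse_loss(decoder(encoder(x)), target)
loss.backward()
opt.step()

# Only the decoder's weights carry gradients after backprop.
print(all(p.grad is None for p in encoder.parameters()))      # True
print(all(p.grad is not None for p in decoder.parameters()))  # True
```

In the real repo, training is driven by config files rather than hand-written loops, but the freeze/train split is the same.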

5
📈 Track the progress

You check colorful charts and logs to see how much better it's getting at spotting and outlining things.
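Under the hood, those charts are just scalar metrics logged once per step. The repo uses wandb for this; here is a dependency-free stand-in that captures the same pattern:

```python
# A tiny wandb-style metrics logger (stand-in, not the repo's code).
history = {"loss": [], "mIoU": []}

def log(step_metrics):
    """Append one training step's metrics to the run history."""
    for name, value in step_metrics.items():
        history[name].append(value)

# Simulated training: loss falls while mean IoU rises.
for step in range(5):
    log({"loss": 1.0 / (step + 1), "mIoU": 0.5 + 0.08 * step})

print(history["loss"][0] > history["loss"][-1])  # True
print(round(history["mIoU"][-1], 2))             # 0.82
```

A dashboard like wandb simply plots these per-metric lists over time.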

6
🧪 Test on new pictures

You try it out on fresh images to measure how accurately it draws boundaries around objects.
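Segmentation accuracy is commonly reported as intersection-over-union (IoU) between predicted and ground-truth masks. A minimal version, assuming masks given as pixel sets (the repo itself relies on the standard COCO/ADE20K evaluation tooling):

```python
def mask_iou(pred, gt):
    """IoU of two binary masks given as sets of (row, col) pixels."""
    pred, gt = set(pred), set(gt)
    union = pred | gt
    if not union:           # two empty masks match perfectly
        return 1.0
    return len(pred & gt) / len(union)

pred = {(0, 0), (0, 1), (1, 0)}
gt = {(0, 0), (0, 1), (1, 1)}
print(mask_iou(pred, gt))  # 2 shared pixels / 4 total = 0.5
```

Averaging this over all classes gives the mIoU numbers reported on benchmarks.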

Accurate outlines achieved

Your model now segments everyday scenes or videos reliably, ready for your projects or research.

AI-Generated Review

What is pmt?

PMT delivers fast image segmentation (semantic, instance, and panoptic) on frozen vision foundation models like DINOv3, processing COCO and ADE20K datasets without touching encoder weights. You train or evaluate via simple PyTorch Lightning CLI commands, like `python image/main.py fit -c configs/coco/panoptic/pmt_l_640.yaml`, hitting competitive benchmarks in hours on H100s. Video segmentation lands soon; this is the official code for a CVPR 2026 workshop paper.

Why is it gaining traction?

It stands out by matching finetuned models' accuracy and speed while keeping encoders frozen and shareable across tasks, so vision foundation models are no longer locked to a single task. The lightweight decoder keeps VRAM modest (26 GB/GPU at batch size 16), and annealing schedules plus wandb logging make for quick experiments. The buzz comes from researchers digging into frozen-encoder approaches ahead of CVPR 2026.
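The "annealing schedules" mentioned above usually mean a learning rate decayed smoothly over training. A generic PyTorch sketch using the built-in cosine scheduler (the repo's exact schedule lives in its config files, so this is an illustration, not its code):

```python
import torch

# One dummy parameter so the optimizer has something to manage.
params = [torch.nn.Parameter(torch.zeros(1))]
opt = torch.optim.SGD(params, lr=0.1)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=10)

lrs = []
for _ in range(10):
    lrs.append(opt.param_groups[0]["lr"])
    opt.step()    # optimizer step first...
    sched.step()  # ...then the scheduler advances the schedule

print(round(lrs[0], 3))      # 0.1 at the start
print(lrs[-1] < lrs[0] / 2)  # True: the rate has annealed well down
```

The cosine shape keeps the rate high early for fast progress, then tapers it for stable convergence.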

Who should use this?

CV/ML researchers benchmarking segmentation on COCO/ADE20K, especially those preparing CVPR 2026 submissions or workshop papers built on foundation models. It also suits university labs like TU/e prototyping multi-task pipelines without full finetuning hassles.

Verdict

Solid academic release with strong docs and configs, but at 18 stars it's early days: video support is still pending and there are no tests. Worth forking if you're building on frozen foundation models; hold off if you need production polish.


