taldatech / lpwm

Public

[ICLR 2026 Oral] Latent Particle World Models official repository

55 stars · 2 forks
Jupyter Notebook
AI Summary

PyTorch codebase for training world models that decompose videos into object particles and predict dynamics without labels.

How It Works

1
🔍 Discover LPWM

You find this tool on GitHub; it magically breaks videos down into their moving objects and predicts what happens next, like understanding a busy street scene.

2
📦 Set up easily

Follow simple instructions to install everything you need on your computer, like creating a ready-to-use workspace.
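
Once the install finishes, a quick smoke test helps confirm the environment took. This is a sketch, not the repo's documented check: the assumption that Accelerate is part of the stack comes from the review below, and the actual pinned dependencies may differ.

```python
# Post-install smoke test: confirm the core stack imports and whether a
# GPU is visible. The package list (torch + accelerate) is an assumption
# drawn from this page, not the repo's own requirements file.
import torch
import accelerate

print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("accelerate", accelerate.__version__)
```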

3
📹 Pick your videos

Choose sample video clips of balls bouncing, robots moving, or games, or add your own short clips.
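
However you source the clips, they end up as frame tensors. A minimal loading sketch using torchvision; the filename and 16-frame window are illustrative, not the repo's defaults:

```python
# Load a short clip and convert it to the (T, C, H, W) float layout
# most PyTorch video models expect. "bouncing_balls.mp4" is a
# placeholder path for one of your own clips.
import torch
from torchvision.io import read_video

frames, _, _ = read_video("bouncing_balls.mp4", pts_unit="sec")  # (T, H, W, C) uint8
frames = frames.permute(0, 3, 1, 2).float() / 255.0              # (T, C, H, W) in [0, 1]
clip = frames[:16]                                               # keep a short window
print(clip.shape)
```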

4
🚀 Launch and watch the magic

Click to run and watch it automatically spot objects, track their paths, and predict future frames.
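
Under the hood this is an autoregressive rollout: encode the clip into particle states, then feed each predicted state back in to get the next. The sketch below shows that pattern with a toy GRU standing in for the learned dynamics; none of these names are LPWM's actual API.

```python
# Illustrative rollout loop. A GRUCell plays the role of the learned
# particle dynamics so the autoregressive pattern runs end to end;
# the real model's classes and methods will differ.
import torch
import torch.nn as nn

state_dim, horizon = 32, 10
dynamics = nn.GRUCell(state_dim, state_dim)  # stand-in for learned dynamics
state = torch.randn(1, state_dim)            # stand-in encoded particle state

with torch.no_grad():
    trajectory = []
    for _ in range(horizon):
        state = dynamics(state, state)       # feed the prediction back in
        trajectory.append(state)
print(torch.stack(trajectory).shape)         # (horizon, 1, state_dim)
```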

5
Train or use ready models

🎯 Use pretrained

Load the ready-made models and generate predictions right away.

🔄 Train your own

Teach it on your videos and watch it improve over time.
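
The pretrained path usually starts with a checkpoint file. A generic PyTorch sketch; the file name and state-dict key below are placeholders, since the repo's checkpoint layout isn't documented here:

```python
# Inspect a downloaded checkpoint before wiring it into the model.
# "lpwm_bair.pt" and the "model" key are assumptions, not the repo's
# actual file names or schema.
import torch

ckpt = torch.load("lpwm_bair.pt", map_location="cpu")
print(sorted(ckpt.keys()))              # see what the checkpoint stores
# model.load_state_dict(ckpt["model"])  # once the model class is instantiated
```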

6
📊 Check cool visuals

View animations of predicted futures, object masks, and paths overlaid on the videos.
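
A minimal overlay sketch with matplotlib; the frame and keypoint arrays are random stand-ins for what the model would output, not calls into the repo's plotting utilities:

```python
# Overlay predicted particle keypoints on a video frame.
# `frame` (H, W, 3) and `keypoints` (N, 2) are random stand-ins here.
import numpy as np
import matplotlib.pyplot as plt

frame = np.random.rand(64, 64, 3)       # stand-in video frame
keypoints = np.random.rand(5, 2) * 64   # stand-in (x, y) particle positions

plt.imshow(frame)
plt.scatter(keypoints[:, 0], keypoints[:, 1], c="red", marker="x")
plt.axis("off")
plt.savefig("overlay.png", bbox_inches="tight")
```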

Unlock video insights

Now you can predict motions, understand scenes, or plan robot actions effortlessly.

AI-Generated Review

What is lpwm?

LPWM trains self-supervised world models on video data to decompose scenes into object-centric "particles" with keypoints, masks, and stochastic dynamics—no labels needed. Built in PyTorch, it predicts future frames, generates videos, and conditions on actions, language instructions, or image goals for tasks like planning. Users get pretrained checkpoints for datasets like BAIR or Bridge, plus tools to evaluate FVD/PSNR and run inference via simple scripts or Colab.
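
Of the two metrics, PSNR is simple enough to state inline. A minimal reference implementation for tensors scaled to [0, 1] (FVD needs a pretrained video network and is not reproduced here):

```python
# Peak signal-to-noise ratio in dB; higher is better. Assumes both
# tensors share a shape and are scaled to [0, max_val].
import torch

def psnr(pred: torch.Tensor, target: torch.Tensor, max_val: float = 1.0) -> torch.Tensor:
    mse = torch.mean((pred - target) ** 2)
    return 10.0 * torch.log10(max_val**2 / mse)

print(psnr(torch.rand(3, 64, 64), torch.rand(3, 64, 64)))
```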

Why is it gaining traction?

As an ICLR 2026 oral, it reports state-of-the-art results on real-world multi-object videos where alternatives struggle with occlusions or dynamics. Devs like the quickstart Colab, the multi-GPU Accelerate support, and pretrained models downloadable from Mega: video generation in minutes without setup hell. Reddit threads on ICLR 2026 papers highlight its edge in decision-making tasks like goal-conditioned imitation.
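
The multi-GPU support mentioned above refers to Hugging Face Accelerate. A generic sketch of that training pattern, with a toy linear model standing in for the world model (the repo's actual training script will differ):

```python
# Minimal Accelerate training step: prepare() wraps the model and
# optimizer for the available devices, and accelerator.backward()
# replaces loss.backward(). The model and loss here are toys.
import torch
import torch.nn as nn
from accelerate import Accelerator

accelerator = Accelerator()
model = nn.Linear(32, 32)                    # stand-in for the world model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
model, optimizer = accelerator.prepare(model, optimizer)

batch = torch.randn(8, 32, device=accelerator.device)
loss = ((model(batch) - batch) ** 2).mean()  # toy reconstruction loss
accelerator.backward(loss)
optimizer.step()
optimizer.zero_grad()
```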

Who should use this?

ML researchers working on unsupervised object discovery in video, e.g. prediction benchmarks like Sketchy and PHYRE. RL engineers building world models for multi-object robotics or games (Mario, PandaGym). Anyone prototyping language-conditioned agents on BridgeData who wants to skip manual segmentation.

Verdict

Grab it if you work on object-centric learning: strong docs, notebooks, and the ICLR 2026 pedigree make experimentation easy. The star count is still low, so the project is fresh; test the pretrained models before committing to heavy training.
