fpv-labs

Python SDK for Stera: Record, Process, Evaluate, and Export Multimodal Data

94% credibility
Found May 25, 2026 at 24 stars.
Python
AI Summary

Stera SDK is a Python toolkit developed by FPV Labs for processing first-person video recordings. It loads recordings that contain RGB video, depth maps, camera poses, and sensor data, then runs AI models to track hands, blur faces for privacy, and estimate body pose. The SDK includes visualization tools built on Rerun for interactive 3D exploration and generates quality evaluation reports. Finally, it exports everything into organized datasets (video, meshes, annotations, calibrations) ready for machine learning workflows in robotics and AI research. The project is Apache 2.0 licensed and available on PyPI.

How It Works

1. 📹 Record first-person video

You capture video footage using the Stera mobile app on your device, collecting rich data with camera motion, hand movements, and 3D depth information.

2. 📂 Load your recording into the SDK

You open your video file with the SDK, and it automatically organizes all the data streams—RGB video, depth maps, camera positions, and sensor readings—into one tidy structure.
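Organizing several streams into one structure usually comes down to matching samples by timestamp. The sketch below shows one common approach, nearest-timestamp alignment against the RGB frames. The `Stream` class, `align_to_rgb`, and the `max_gap_ns` parameter are hypothetical names for illustration only; they are not the SDK's API, and the SDK may synchronize its data differently.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class Stream:
    """One sensor stream: sorted timestamps (ns) and one payload per timestamp.

    Hypothetical container for illustration; assumes the stream is non-empty.
    """
    timestamps: list
    payloads: list

    def nearest(self, t):
        """Return (timestamp, payload) of the sample closest in time to t."""
        i = bisect_left(self.timestamps, t)
        # Only the neighbours on either side of the insertion point can be closest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(self.timestamps)]
        best = min(candidates, key=lambda j: abs(self.timestamps[j] - t))
        return self.timestamps[best], self.payloads[best]


def align_to_rgb(rgb, others, max_gap_ns):
    """For each RGB frame, attach the nearest sample from every other stream.

    A sample further than max_gap_ns from the frame is reported as None,
    so dropped depth frames or sensor gaps stay visible downstream.
    """
    rows = []
    for t, frame in zip(rgb.timestamps, rgb.payloads):
        row = {"t": t, "rgb": frame}
        for name, stream in others.items():
            ts, payload = stream.nearest(t)
            row[name] = payload if abs(ts - t) <= max_gap_ns else None
        rows.append(row)
    return rows


rgb = Stream([0, 33, 66], ["f0", "f1", "f2"])
depth = Stream([1, 34, 200], ["d0", "d1", "d2"])
imu = Stream([0, 10, 20, 30, 40, 50, 60, 70], list(range(8)))
rows = align_to_rgb(rgb, {"depth": depth, "imu": imu}, max_gap_ns=5)
```

Reporting a missing match as `None` instead of silently reusing a stale sample is a deliberate choice in this sketch: it keeps gaps in one stream from contaminating frames that otherwise look complete.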

3. 🖐️ AI automatically spots your hands

The system runs a hand-tracking model that locates the 21 joints of each detected hand in every frame, giving you 3D positions of the fingers and wrist in real space.
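One way to get from a detected joint in the image to a 3D position is to combine its pixel location with the depth map through the pinhole camera model. The sketch below illustrates that step. The 21-landmark layout (wrist, then four joints per finger) follows the convention used by MediaPipe Hands; whether the SDK stores joints in this order, and whether it lifts them to 3D this way, are assumptions. `Intrinsics`, `unproject`, and `fingertip_indices` are hypothetical names.

```python
from dataclasses import dataclass

# MediaPipe Hands ordering: index 0 is the wrist, then four joints per finger.
FINGERS = ("thumb", "index", "middle", "ring", "pinky")
NUM_LANDMARKS = 1 + 4 * len(FINGERS)  # 21 joints per hand


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole camera intrinsics in pixels (hypothetical container)."""
    fx: float
    fy: float
    cx: float
    cy: float


def unproject(u, v, depth_m, k):
    """Lift pixel (u, v) with metric depth into camera-frame coordinates."""
    x = (u - k.cx) * depth_m / k.fx
    y = (v - k.cy) * depth_m / k.fy
    return (x, y, depth_m)


def fingertip_indices():
    """Map each finger name to the landmark index of its tip."""
    return {name: 4 * (i + 1) for i, name in enumerate(FINGERS)}


k = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
center = unproject(320.0, 240.0, 0.5, k)   # a joint at the principal point
offset = unproject(420.0, 240.0, 1.0, k)   # 100 px to the right, 1 m away
tips = fingertip_indices()
```

The result is expressed in the camera frame. Placing joints in a fixed world frame would additionally require applying the camera pose for that frame, which is not shown here.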

4. 🔒 Faces get blurred automatically

When privacy matters, an AI face detector finds and blurs the faces of any people in the footage with a single command, so recordings can be shared without exposing identities.
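The anonymization step itself is simple once a detector has produced a face bounding box. The sketch below pixelates a rectangle in a grayscale image stored as a list of rows. It is a dependency-free illustration, not the SDK's implementation: the detector is not shown, pixelation stands in for whatever blur the SDK applies, and `blur_box` with its `block` parameter is a hypothetical name.

```python
def blur_box(image, box, block=8):
    """Pixelate a rectangle in a grayscale image (list of rows of ints).

    box is (x0, y0, x1, y1), half-open, and is clipped to the image bounds.
    Returns a new image; the input is left unmodified.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    x0, y0, x1, y1 = box
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width, x1), min(height, y1)

    out = [row[:] for row in image]
    for by in range(y0, y1, block):
        for bx in range(x0, x1, block):
            ys = range(by, min(by + block, y1))
            xs = range(bx, min(bx + block, x1))
            # Replace every pixel in the block with the block's mean value.
            total = sum(image[y][x] for y in ys for x in xs)
            mean = total // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


image = [[x + 4 * y for x in range(4)] for y in range(4)]
blurred = blur_box(image, (0, 0, 2, 2), block=2)
clipped = blur_box(image, (3, 3, 10, 10), block=8)
```

A larger `block` removes more detail. For real footage the box would typically be padded beyond the detector's output so hair and ears are covered as well.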

5. See your data come alive

🎮 3D Scene

Watch your recorded journey unfold as a 3D visualization with your camera position, hand movements, and the environment mesh all together.

📊 Quality Report

Get an interactive web page showing statistics on data quality, hand detection rates, camera smoothness, and any issues to watch for.
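Statistics like these can be computed from per-frame results with very little code. The sketch below shows two plausible ones: the fraction of frames with a detected hand, and a simple camera smoothness check that flags frames where the camera position jumps further than a threshold. The metric definitions, the function names, and `jump_threshold_m` are assumptions made for illustration; the SDK's report may define its metrics differently.

```python
import math


def detection_rate(detections):
    """Fraction of frames in which at least one hand was detected."""
    return sum(detections) / len(detections) if detections else 0.0


def step_lengths(positions):
    """Distance travelled by the camera between consecutive frames."""
    return [math.dist(a, b) for a, b in zip(positions, positions[1:])]


def trajectory_summary(positions, jump_threshold_m):
    """Summarize a camera trajectory and flag suspicious jumps.

    A jump is reported by the index of the frame that arrives after it.
    """
    steps = step_lengths(positions)
    if not steps:
        return {"path_length_m": 0.0, "max_step_m": 0.0, "jumps": []}
    return {
        "path_length_m": sum(steps),
        "max_step_m": max(steps),
        "jumps": [i + 1 for i, s in enumerate(steps) if s > jump_threshold_m],
    }


rate = detection_rate([True, True, False, True])
positions = [(0, 0, 0), (3, 4, 0), (3, 4, 0), (3, 4, 12)]
summary = trajectory_summary(positions, jump_threshold_m=10)
```

Large single-frame jumps often point to tracking loss or relocalization in the pose estimate, which is why a per-step threshold is more informative here than an average over the whole recording.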

6. 🎁 Package everything for training

With one command, all your data exports into a neat folder containing video, 3D mesh, hand tracking annotations, and calibration details—ready for machine learning pipelines.
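To make the idea of an exported episode folder concrete, the sketch below writes calibration and hand annotations into a small directory tree and adds a manifest that lists the files. The layout, the file names (`camera.json`, `hands.jsonl`, `manifest.json`), and the `export_episode` function are all hypothetical; the SDK's real export format is not documented here, and the video and mesh files are omitted.

```python
import json
from pathlib import Path


def export_episode(root, episode_id, calibration, hand_annotations):
    """Write one episode to disk and return its directory.

    Hypothetical layout:
        <root>/<episode_id>/calibration/camera.json
        <root>/<episode_id>/annotations/hands.jsonl
        <root>/<episode_id>/manifest.json
    """
    episode_dir = Path(root) / episode_id
    (episode_dir / "annotations").mkdir(parents=True, exist_ok=True)
    (episode_dir / "calibration").mkdir(exist_ok=True)

    calib_path = episode_dir / "calibration" / "camera.json"
    calib_path.write_text(json.dumps(calibration, indent=2))

    # JSON Lines: one record per frame, so large episodes can be streamed.
    with (episode_dir / "annotations" / "hands.jsonl").open("w") as f:
        for record in hand_annotations:
            f.write(json.dumps(record) + "\n")

    # List the data files before the manifest itself is written.
    files = sorted(
        p.relative_to(episode_dir).as_posix()
        for p in episode_dir.rglob("*")
        if p.is_file() and p.name != "manifest.json"
    )
    manifest = {
        "episode_id": episode_id,
        "num_frames": len(hand_annotations),
        "files": files,
    }
    (episode_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return episode_dir
```

A manifest like this lets a training pipeline validate an episode without walking the directory, which matters once datasets contain thousands of episodes.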

Your data is ready for AI

You have clean, annotated first-person video data structured perfectly for training embodied AI systems, robotics models, or vision-language systems.

AI-Generated Review

What is stera-sdk?

Stera-sdk is a Python library for processing first-person video recordings. It loads data recorded with the Stera mobile app (stored in MCAP, a container format for timestamped multimodal data that ROS 2 adopted as its default bag storage), runs computer vision models on the footage, and exports clean datasets ready for training embodied AI systems. The SDK handles the messy real-world pipeline: synchronized RGB and depth frames, camera poses, IMU data, and 3D meshes from SLAM. You can plug in different hand trackers (MediaPipe, WiLoR, HaMeR), blur faces for privacy, and review the results in an interactive HTML report. The export function bundles everything into a standardized episode folder with video, annotations, calibrations, and meshes.

Why is it gaining traction?

The embodied AI space is exploding, and researchers need high-quality first-person data pipelines. This SDK solves the hardest part: getting from raw recordings to structured, annotated datasets. The swappable model backends are a big win—you can start with zero-setup MediaPipe and swap in WiLoR or HaMeR when you need better accuracy. The evaluation module generates detailed HTML reports with trajectory plots, IMU analysis, and hand detection timelines. That kind of QC tooling is usually scattered across five different scripts.
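Swappable backends of this kind are commonly built as a shared interface plus a registry keyed by name. The sketch below shows that pattern in isolation. `HandTracker`, `register`, `create_tracker`, and the `"dummy"` backend are hypothetical names, not the SDK's API; a real backend would wrap MediaPipe, WiLoR, or HaMeR behind the same `track` method.

```python
from typing import Callable, Dict, List, Protocol


class HandTracker(Protocol):
    """Interface every backend is assumed to implement (hypothetical)."""

    name: str

    def track(self, frame) -> List[list]:
        """Return one list of 21 joints per detected hand."""
        ...


_REGISTRY: Dict[str, Callable[[], HandTracker]] = {}


def register(name):
    """Class decorator that makes a backend selectable by name."""
    def decorator(factory):
        _REGISTRY[name] = factory
        return factory
    return decorator


@register("dummy")
class DummyTracker:
    """Placeholder backend that never detects a hand."""

    name = "dummy"

    def track(self, frame):
        return []


def create_tracker(name):
    """Instantiate a backend by name, listing valid names on failure."""
    try:
        factory = _REGISTRY[name]
    except KeyError:
        available = sorted(_REGISTRY)
        raise ValueError(f"unknown backend {name!r}; available: {available}") from None
    return factory()


tracker = create_tracker("dummy")
hands = tracker.track(frame=None)
```

Because callers only depend on the interface, moving from a lightweight backend to a more accurate one becomes a one-string change and leaves the rest of the pipeline alone.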

Who should use this?

Robotics researchers building vision-language-action models (VLAs) or world models will get the most value. If you're ingesting first-person video for manipulation or navigation tasks, this handles the data wrangling. Computer vision engineers evaluating hand tracking models will appreciate the benchmark-ready export format. FPV content creators who want automated face blurring without stitching together separate tools will find it useful too. It's not for casual users: expect to read the docs and understand MCAP or ROS concepts.

Verdict

Stera-sdk fills a real gap in the embodied AI tooling landscape, but it's early-stage (24 stars, v0.0.4). The architecture is solid and the feature set is comprehensive, but test coverage and community support are unknowns. The 94% credibility score reflects a small but active project from FPV Labs. Worth evaluating for serious data pipeline work, but budget time for digging into docs and potentially filing issues.
