maim010 / openclaw-video-vision
Public repository. AI-powered video understanding: extract key frames from YouTube, Bilibili, and any video page, then get structured summaries via vision AI. Supports yt-dlp, Playwright, and some common cloud browsers.
This open-source project is a tool for understanding videos from platforms like YouTube and Bilibili by pulling out key frames and using AI to generate structured summaries including key moments, topics, and timestamps.
How It Works
You find a video on YouTube or Bilibili that looks great but is too long to watch fully.
You install this tool on your computer so you can make sense of videos without watching them in full.
You link up an AI service that can look at images and figure out what's happening in videos.
You paste the video's web link into the tool, adding any private access info if the video is restricted.
The tool plays the video in the background and saves still frames from different moments.
The vision AI service studies the frames and creates an easy-to-read breakdown of the video.
You receive a clear report with the main summary, standout moments with times, and key topics – perfect for quick learning!
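The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the evenly spaced sampling strategy and the names sample_timestamps, format_timestamp, KeyMoment, and VideoReport are all hypothetical, and the real tool may choose frames and structure its report differently. The sketch covers just the two ends of the pipeline: deciding which moments to capture, and shaping the final report with a summary, timestamped moments, and topics.

```python
from dataclasses import dataclass, field


def sample_timestamps(duration_s: float, n_frames: int) -> list[float]:
    """Return n_frames timestamps (in seconds) at the midpoints of equal segments.

    Assumption: frames are sampled evenly; the real tool may pick them differently.
    """
    if duration_s <= 0 or n_frames <= 0:
        return []
    step = duration_s / n_frames
    return [round(step * (i + 0.5), 2) for i in range(n_frames)]


def format_timestamp(seconds: float) -> str:
    """Format a number of seconds as M:SS, or H:MM:SS for videos over an hour."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


@dataclass
class KeyMoment:
    """One standout moment: when it happens and what the vision model saw."""
    time_s: float
    description: str


@dataclass
class VideoReport:
    """Hypothetical shape of the final report: summary, key moments, topics."""
    summary: str
    key_moments: list[KeyMoment] = field(default_factory=list)
    topics: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the report as plain text, one key moment per line."""
        lines = [f"Summary: {self.summary}", "Key moments:"]
        for moment in self.key_moments:
            lines.append(f"  [{format_timestamp(moment.time_s)}] {moment.description}")
        lines.append("Topics: " + ", ".join(self.topics))
        return "\n".join(lines)


if __name__ == "__main__":
    # A 10-minute video sampled at 4 points; descriptions are placeholders
    # standing in for whatever the vision AI service would return.
    times = sample_timestamps(600, 4)
    report = VideoReport(
        summary="Placeholder summary from the vision model.",
        key_moments=[KeyMoment(t, f"Frame {i + 1}") for i, t in enumerate(times)],
        topics=["topic A", "topic B"],
    )
    print(report.render())
```

Sampling at segment midpoints avoids the first and last instants of a video, which are often title cards or end screens. The frame extraction itself and the call to the vision AI service are left out on purpose, since they depend on which downloader, browser, and AI provider are configured.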