mikefutia

Claude Vision Skill (Mike Futia | SCALE AI)

44
12
100% credibility
Found May 11, 2026 at 44 stars.
Language: Python
AI Summary

This repository provides a Claude Code skill that uses Google's Gemini to analyze uploaded videos and generate structured markdown reports with summaries, scene breakdowns, transcripts, visual details, and key moments.

How It Works

1. 🔍 Discover the skill

You find Claude Vision, a handy add-on that lets your AI assistant Claude watch and break down videos like ads, meetings, or tutorials.

2. 📥 Grab the files

Download the skill files and place them in your Claude skills folder (~/.claude/skills), naming the folder video-analyzer so Claude can find it.
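The copy step can be sketched in Python. This is a minimal sketch, assuming the skills folder is ~/.claude/skills (the path the review mentions) and that the downloaded files sit in one local directory. The function name `install_skill` and its parameters are hypothetical and not part of the repository.

```python
import shutil
from pathlib import Path


def install_skill(source_dir, skills_root=None):
    """Copy downloaded skill files into <skills_root>/video-analyzer.

    Assumption: skills live under ~/.claude/skills unless another root is given.
    """
    root = Path(skills_root) if skills_root else Path.home() / ".claude" / "skills"
    target = root / "video-analyzer"  # folder name the listing asks for
    target.parent.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok lets a re-install overwrite an earlier copy (Python 3.8+)
    shutil.copytree(source_dir, target, dirs_exist_ok=True)
    return target
```

Calling `install_skill("path/to/downloaded/files")` returns the destination path, which makes it easy to confirm where the files landed.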

3. 🔗 Connect free video service

Sign up for a free account at Google AI Studio, create a Gemini API key, and ask Claude to configure it so the key is available wherever you use the skill.
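A quick way to confirm the key is reachable before analyzing anything is a small check like the one below. It is a sketch only: the listing does not say where the skill stores the key, so the environment variable name `GEMINI_API_KEY` and the helper `load_gemini_key` are assumptions.

```python
import os


def load_gemini_key(env=None):
    """Return the Gemini API key from the environment or raise a clear error.

    Assumption: the key is exported as GEMINI_API_KEY.
    """
    env = os.environ if env is None else env
    key = (env.get("GEMINI_API_KEY") or "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; create a key in Google AI Studio first."
        )
    return key
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.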

4. 🛠️ Add the video helper

Run one command, pip install google-genai, to install the Gemini client library that makes video analysis possible.

5. 🎥 Analyze your video

In Claude Code, simply tell it to use the video-analyzer skill on any video file, like a screen recording or demo.

📊 Get your full report

Claude delivers a clear markdown report with a summary, scene-by-scene timestamps, audio notes, visual details, and key moments. The skill's anti-hallucination rules are designed to keep the report from inventing speakers or details.
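To make the report shape concrete, here is a minimal sketch of how such a markdown report could be assembled. The section headings, the `Scene` dataclass, and the function names are illustrative assumptions; the repository's actual template is not shown in this listing.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    start_s: int       # scene start, in seconds
    end_s: int         # scene end, in seconds
    description: str


def fmt_ts(seconds):
    """Format a number of seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def render_report(summary, scenes, transcript, visual_details, key_moments):
    """Assemble the five report sections into one markdown string."""
    lines = ["# Video Analysis", "", "## Summary", "", summary, ""]
    lines += ["## Scene Breakdown", ""]
    for scene in scenes:
        span = f"{fmt_ts(scene.start_s)} - {fmt_ts(scene.end_s)}"
        lines.append(f"- [{span}] {scene.description}")
    lines += ["", "## Transcript", "", transcript, ""]
    lines += ["## Visual Details", ""]
    lines += [f"- {detail}" for detail in visual_details]
    lines += ["", "## Key Moments", ""]
    lines += [f"- {moment}" for moment in key_moments]
    return "\n".join(lines) + "\n"
```

Keeping timestamps as integers until rendering makes it simple to sort or filter scenes before the report is written.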

AI-Generated Review

What is claude-vision?

claude-vision is a Python-based skill for Claude Code that analyzes videos by routing them through Google's Gemini API. Drop in any supported video file, such as an MP4, a MOV, or a screen recording, and it produces a structured markdown report with a top-level summary, timestamped scene breakdowns, audio transcripts, visual details, and key moments. It gives Claude video understanding for real-world tasks like ad reviews or meeting recaps, using a free Gemini API key with generous limits and no setup beyond pip install google-genai.
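The routing described above can be sketched with the google-genai client. This is an illustration under stated assumptions, not the repository's code: the function name `analyze_video`, the `SUPPORTED` set (limited to the two extensions named in the text), and the polling interval are all hypothetical. It needs `pip install google-genai` and an API key in the environment to actually run an analysis.

```python
import time
from pathlib import Path

# Only the extensions the review names; the real supported list may be longer.
SUPPORTED = {".mp4", ".mov"}


def analyze_video(path, prompt, model="gemini-2.5-flash"):
    """Upload a video to Gemini and return the model's text response."""
    video = Path(path)
    if video.suffix.lower() not in SUPPORTED:
        raise ValueError(f"Unsupported video type: {video.suffix or '(none)'}")

    from google import genai  # imported lazily so the check above needs no SDK

    client = genai.Client()  # picks up the API key from the environment
    uploaded = client.files.upload(file=str(video))

    # Uploaded videos are processed asynchronously; wait until the file is ready.
    while uploaded.state.name == "PROCESSING":
        time.sleep(2)
        uploaded = client.files.get(name=uploaded.name)
    if uploaded.state.name != "ACTIVE":
        raise RuntimeError(f"Upload failed with state {uploaded.state.name}")

    response = client.models.generate_content(
        model=model,
        contents=[uploaded, prompt],
    )
    return response.text
```

The returned text is whatever the prompt asked for, so the markdown structure of the report depends on the instructions passed in `prompt`.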

Why is it gaining traction?

The skill stands out for its strict anti-hallucination rules, which tell the model not to fabricate speakers or details, so its reading of on-screen text, objects, and scenes is easier to trust. Developers invoke it from Claude Code with a simple /video-analyzer command, with flags for custom prompts, FPS sampling, or the Gemini model (for example gemini-2.5-flash), so there is no need to build a video pipeline from scratch. The structured output turns raw video into actionable insights quickly, which a generic vision API call does not give you out of the box.
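The options and guardrails mentioned above might look roughly like this. The flag names (`--prompt`, `--fps`, `--model`), their defaults, and the wording of the rules are hypothetical; the listing only says that flags for custom prompts, FPS sampling, and the Gemini model exist and that the rules forbid fabricating speakers or details.

```python
import argparse

# Hypothetical wording; the repository's actual rules are not quoted in the listing.
GUARDRAILS = (
    "- Do not invent speaker names; write 'Speaker 1', 'Speaker 2' if unknown.\n"
    "- Do not describe anything that is not visible or audible in the video.\n"
    "- Write 'unclear' instead of guessing."
)


def build_parser():
    """Command-line options mirroring the three kinds of flags the review lists."""
    parser = argparse.ArgumentParser(prog="video-analyzer")
    parser.add_argument("video", help="path to the video file")
    parser.add_argument("--prompt", default=None, help="custom analysis prompt")
    parser.add_argument("--fps", type=float, default=1.0, help="frames sampled per second")
    parser.add_argument("--model", default="gemini-2.5-flash", help="Gemini model name")
    return parser


def build_prompt(custom_prompt=None):
    """Append the guardrail rules to the default or custom task description."""
    task = custom_prompt or (
        "Summarize this video, break it into timestamped scenes, transcribe the "
        "audio, describe visual details, and list key moments."
    )
    return f"{task}\n\nRules:\n{GUARDRAILS}"
```

Appending the rules to every prompt, including custom ones, means a user-supplied prompt cannot silently drop the guardrails.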

Who should use this?

Performance marketers who tear down UGC ads or competitor videos for beat-by-beat insights. Developers who convert Loom tutorials or demos into step-by-step SOPs and notes. Product teams who recap meetings or screen recordings to extract action items, especially teams that already work in Claude Code.

Verdict

Grab it if you need a quick way to give Claude Code video analysis for ad reviews or recaps: solid docs and an MIT license make it simple to drop into ~/.claude/skills. With just 44 stars and a 1.0% credibility score, it is early-stage and unproven at scale, so test it on non-critical videos before relying on it in production.
