JoeYing1019 / ODE
Implementation for: Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents
A research framework for training multimodal AI agents to perform deep searches combining text and visual tools using supervised fine-tuning and reinforcement learning.
How It Works
You learn about ODE, a way to train AI helpers that search the web, images, and papers using both words and pictures to find answers.
Download the code and set up your environment so the agent can call free search services and connect to language models.
Gather examples of successful searches, with images and answers, to teach the agent how to explore.
Run supervised fine-tuning so the agent learns from your examples to use tools such as zooming into images or visiting sites.
Continue training with reinforcement learning, where the agent practices live and is rewarded for finding the best evidence and answers.
Challenge it on tough questions and see how well it searches and reasons with visuals.
Now you have a powerful visual search helper that gathers evidence from everywhere to solve complex problems!
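To make the "rewards for finding the best evidence and answers" step more concrete, here is a minimal sketch of what such a reward could look like. It is not taken from the ODE repository: the `Trajectory` class, the `toy_reward` function, the exact-match scoring, and the 0.7 answer weight are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    """One search episode (hypothetical): evidence gathered plus the final answer."""
    evidence: list[str] = field(default_factory=list)
    answer: str = ""


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def toy_reward(traj: Trajectory, gold_answer: str, gold_evidence: list[str],
               answer_weight: float = 0.7) -> float:
    """Assumed reward shape: weighted sum of answer match and evidence coverage."""
    # 1.0 if the final answer matches the reference after normalization, else 0.0.
    answer_score = 1.0 if normalize(traj.answer) == normalize(gold_answer) else 0.0

    # Fraction of reference evidence items that the agent actually gathered.
    gathered = {normalize(e) for e in traj.evidence}
    if gold_evidence:
        hits = sum(1 for e in gold_evidence if normalize(e) in gathered)
        evidence_score = hits / len(gold_evidence)
    else:
        evidence_score = 0.0

    return answer_weight * answer_score + (1.0 - answer_weight) * evidence_score


# Example: correct answer, one of two reference evidence items found.
episode = Trajectory(evidence=["A", "b"], answer=" Paris ")
score = toy_reward(episode, gold_answer="paris", gold_evidence=["a", "c"])
```

In this sketch a correct answer backed by only half of the reference evidence scores lower than one backed by all of it, which is one simple way to reward evidence gathering alongside the final answer. A real training setup would likely use softer matching than exact string equality.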