joshhu / skillopt-qa
Minimal faithful re-implementation of Microsoft SkillOpt: a text-space optimizer that trains a deployable natural-language skill for a frozen LLM agent on HotpotQA.
SkillOpt-QA is a research tool that improves how an AI assistant answers complex questions by training a reusable set of instructions (called a 'skill') rather than modifying the AI model itself. The project downloads a dataset of multi-hop reasoning questions, runs a training loop where an AI attempts questions while another AI reviews mistakes and proposes small instruction improvements, and validates each change against a held-out set. The final output is a simple text file containing optimized instructions that can be attached to any question-answering AI to improve its performance - no model retraining required.
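The propose-then-validate loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `optimize_skill`, `answer_fn`, `propose_fn`, exact-match scoring, and the strict-improvement acceptance rule are all hypothetical, and the toy functions stand in for real LLM calls.

```python
import random
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (question, gold answer)


def _norm(text: str) -> str:
    """Normalize an answer for a simple exact-match comparison."""
    return text.strip().lower()


def accuracy(answer_fn: Callable[[str, str], str], skill: str,
             examples: List[Example]) -> float:
    """Fraction of examples the agent answers correctly with this skill."""
    hits = sum(_norm(answer_fn(skill, q)) == _norm(a) for q, a in examples)
    return hits / len(examples)


def optimize_skill(skill: str, train: List[Example], val: List[Example],
                   answer_fn: Callable[[str, str], str],
                   propose_fn: Callable[[str, List[Example]], str],
                   rounds: int = 5, batch_size: int = 4,
                   seed: int = 0) -> Tuple[str, float]:
    """Keep a proposed skill edit only if held-out accuracy strictly improves."""
    rng = random.Random(seed)
    best = accuracy(answer_fn, skill, val)
    for _ in range(rounds):
        batch = rng.sample(train, min(batch_size, len(train)))
        # Collect the questions the current skill gets wrong.
        failures = [(q, a) for q, a in batch
                    if _norm(answer_fn(skill, q)) != _norm(a)]
        if not failures:
            continue
        candidate = propose_fn(skill, failures)   # reviewer suggests an edit
        score = accuracy(answer_fn, candidate, val)
        if score > best:                          # assumed acceptance rule
            skill, best = candidate, score
    return skill, best


# Toy stand-ins (assumptions): a real setup would call an LLM in both places.
def toy_answer(skill: str, question: str) -> str:
    return "paris" if "decompose" in skill.lower() else "unknown"


def toy_propose(skill: str, failures: List[Example]) -> str:
    return skill + "\nDecompose the question into single-hop steps."


if __name__ == "__main__":
    train = [("q1", "Paris"), ("q2", "Paris")]
    val = [("q3", "Paris")]
    final_skill, final_score = optimize_skill(
        "Answer concisely.", train, val, toy_answer, toy_propose)
    print(final_score)
    print(final_skill)
```

Validating each candidate on a held-out set, rather than on the batch that produced the failures, is what keeps an edit from being accepted merely because it fits the questions it was written for.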
How It Works
You learn about SkillOpt - a technique that teaches an AI assistant to answer complex multi-hop questions better, without changing the AI itself.
You install the tools and connect to an AI service that will power your question-answering assistant.
You download a collection of challenging questions that require piecing together information from multiple sources.
Your AI tries answering questions, then another AI reviews the mistakes and suggests small improvements to the instructions - only improvements that actually work get kept.
Through multiple rounds, the instructions become sharper - each time the AI makes mistakes, the optimizer learns what to fix.
The final result is a single text file containing the optimized instructions, which you can attach to any question-answering AI to improve its answers on multi-hop questions.
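As a rough sketch of what attaching the resulting text file could look like, the snippet below reads a skill file and prepends it to a question. The file name `skill.md`, the function `build_prompt`, and the prompt layout are illustrative assumptions, not the project's actual interface.

```python
import tempfile
from pathlib import Path


def build_prompt(skill_path: str, question: str) -> str:
    """Prepend the trained skill text to a question (assumed prompt layout)."""
    skill = Path(skill_path).read_text(encoding="utf-8").strip()
    return f"{skill}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    # Write a placeholder skill file so the example runs on its own.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "skill.md"  # hypothetical file name
        path.write_text("Decompose the question into single-hop steps.\n",
                        encoding="utf-8")
        print(build_prompt(str(path), "Which city hosts the museum?"))
```

Because the skill is plain text, the same file can be reused with a different model by changing only the code that sends the prompt.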