mai-yyy

An MCP tool that lets Claude Code hand work to Codex and call multiple models (GPT, Kimi, DeepSeek, and others) at the same time.

18
0
69% credibility
Found May 30, 2026 at 18 stars.
AI Analysis
Python
AI Summary

This is an open-source tool that extends Claude Code by letting it simultaneously call multiple AI models (GPT, Kimi, DeepSeek, Qwen, and Claude) and control Codex CLI for coding tasks. Users can ask a single question to multiple AI services at once and compare their answers, have multiple models review code together, or delegate coding tasks to Codex CLI. The tool manages conversation history and handles long-running tasks by returning job IDs that can be checked later. It's designed for developers who want to leverage different AI models' strengths in one workflow, with clear documentation and MIT licensing.

How It Works

1. 💡 You want your AI coding assistant to do more

You're using Claude Code and wish it could compare answers from different AI models or run coding tasks automatically.

2. 🔌 You connect this tool to Claude Code

You add this MCP service to Claude Code so it can access multiple AI models and Codex CLI in one place.

3. 🌐 Your AI assistant gains superpowers

Claude Code can now ask GPT, Kimi, DeepSeek, Qwen, and even Claude simultaneously, getting multiple perspectives on any question.

4. You choose what to do next
🤖 Compare AI models

Send one question to GPT, Kimi, DeepSeek, and Qwen at the same time and see all their answers side by side.

🔍 Have AI review your code

Ask multiple models to review the same piece of code and give their opinions on bugs or improvements.

⚙️ Let Codex handle tasks

Ask Codex CLI to view files, modify code, or refactor your project while you wait.

5. Long tasks keep running in the background

If a task takes a while, the tool returns a job ID and keeps working while you continue your conversation.

6. 🔄 You check back when ready

You can ask the tool to wait for results or check on the status whenever you want, without losing context.

🎉 You get comprehensive results

You receive answers from multiple AI models, Codex task results, or code reviews, all organized and ready to use.

AI-Generated Review

What is multi-llm-mcp?

This is a Python-based MCP server that extends Claude Code with two powerful capabilities. First, it lets Claude Code delegate tasks to OpenAI's Codex CLI for code viewing, editing, and project restructuring. Second, it enables simultaneous calls to multiple AI models (DeepSeek, Kimi, Qwen, GPT, and optionally Claude) so you can compare answers side-by-side. The tool handles long-running tasks by returning a job ID upfront, which you then poll with a separate wait tool to avoid MCP timeouts.
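The job-ID pattern described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `spawn_job`, `wait_job`, and `slow_task`, the in-memory job table, and the status strings are all assumptions made for the example.

```python
import threading
import time
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical in-memory job table; the real project may track jobs differently.
_executor = ThreadPoolExecutor(max_workers=4)
_jobs: dict[str, Future] = {}
_lock = threading.Lock()


def spawn_job(task, *args) -> str:
    """Start `task` in the background and return a job ID right away."""
    job_id = uuid.uuid4().hex
    future = _executor.submit(task, *args)
    with _lock:
        _jobs[job_id] = future
    return job_id


def wait_job(job_id: str, timeout: float) -> dict:
    """Wait at most `timeout` seconds, then report status instead of blocking."""
    with _lock:
        future = _jobs.get(job_id)
    if future is None:
        return {"status": "unknown"}
    try:
        result = future.result(timeout=timeout)
    except FutureTimeout:
        # Still running: the caller can come back and wait again later.
        return {"status": "running"}
    except Exception as exc:
        return {"status": "failed", "error": str(exc)}
    return {"status": "done", "result": result}


def slow_task(prompt: str) -> str:
    """Stand-in for a long-running coding task (assumed, for illustration)."""
    time.sleep(0.3)
    return f"finished: {prompt}"
```

Because each `wait_job` call is bounded by its own timeout, no single tool call has to outlive the client's limit; the caller simply repeats the wait until the status is no longer `running`.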

Why is it gaining traction?

The killer feature is parallel multi-model querying. Instead of switching between models manually, you broadcast a single prompt and get responses from multiple providers at once. This is useful for cross-validation, exploring different reasoning approaches, or simply picking the best answer. The async job pattern is also clever: it sidesteps Claude Code's MCP timeout limits by splitting work into spawn and wait phases. The project also supports Chinese-language models like Kimi and Qwen alongside Western options, a combination that is rare among similar tools.
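One way to picture the parallel broadcast is the sketch below. It is an assumption-laden illustration rather than the tool's implementation: `fake_model` is a hypothetical stand-in that sleeps instead of calling a provider API, and `broadcast` is an invented helper name.

```python
import asyncio


async def fake_model(name: str, prompt: str, delay: float) -> str:
    """Stand-in for a real provider call; sleeps instead of hitting an API."""
    await asyncio.sleep(delay)
    if name == "broken":
        raise RuntimeError("provider unavailable")
    return f"{name} answer to: {prompt}"


async def broadcast(prompt: str, models: dict[str, float]) -> dict[str, str]:
    """Send one prompt to every model concurrently and collect results by name."""
    names = list(models)
    results = await asyncio.gather(
        *(fake_model(n, prompt, models[n]) for n in names),
        return_exceptions=True,  # one failing provider should not sink the rest
    )
    return {
        n: (f"error: {r}" if isinstance(r, Exception) else r)
        for n, r in zip(names, results)
    }


if __name__ == "__main__":
    answers = asyncio.run(
        broadcast("Is this loop off by one?", {"gpt": 0.02, "kimi": 0.01, "broken": 0.01})
    )
    for name, text in answers.items():
        print(f"{name}: {text}")
```

Collecting exceptions instead of raising them keeps the side-by-side comparison usable even when one provider is down, which seems like the natural choice for a cross-validation workflow.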

Who should use this?

Developers who want to compare AI responses without context-switching. Researchers validating code changes across multiple models. Teams using Claude Code who also want access to Codex CLI capabilities. Anyone running multiple LLM providers and tired of juggling separate tools. It is not yet ideal for production deployments, given the low star count and limited documentation.

Verdict

A genuinely useful concept with solid implementation, but early-stage. The 69% credibility score reflects a small, new project with minimal community validation. Test coverage exists, but the project lacks polished docs and real-world battle-testing. Worth trying for personal use or experimentation, but wait for more maturity before betting on it for critical workflows.
