noonghunna / qwen36-dual-3090
Qwen3.6-27B on dual RTX 3090 - TP=2 recipe, vLLM nightly, MTP + fp8 KV, validated for concurrent serving
A set of ready-to-launch configurations for running a large AI language model on dual RTX 3090 graphics cards for fast local serving, with full support for long context, vision input, and tool calling.
How It Works
You want to run a powerful AI assistant on your own computer, using two consumer graphics cards instead of the cloud.
Make sure you have two RTX 3090 graphics cards, enough disk space for the model weights, and the software basics like Docker.
Run a simple setup command to download the AI model weights and preparation files safely onto your computer.
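The setup step above typically amounts to pulling the model weights from a hub; a minimal sketch using the Hugging Face CLI, where the repo id and local path are illustrative assumptions, not names confirmed by this repo:

```shell
# Install the Hugging Face CLI and fetch the weights locally
# (the model repo id below is a placeholder, not confirmed by this summary)
pip install -U "huggingface_hub[cli]"
huggingface-cli download Qwen/Qwen3.6-27B --local-dir ./models/qwen3.6-27b
```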
Start the server with one command and watch it boot, splitting the model across both graphics cards (tensor parallelism) for speed.
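The one-command launch described above maps roughly to a vLLM invocation like the following; a sketch only, assuming vLLM's OpenAI-compatible server, with the model path and context length as illustrative values (the repo's MTP speculative-decoding settings come from its own config and are not reproduced here):

```shell
# Serve across both 3090s: TP=2 and fp8 KV cache, matching the tagline
# (exact flag values here are assumptions, not copied from the repo)
vllm serve ./models/qwen3.6-27b \
  --tensor-parallel-size 2 \
  --kv-cache-dtype fp8 \
  --max-model-len 32768
```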
Great all-around speed for image inputs and long conversations.
More chats served at the same time without slowing down.
Lightning-fast replies, especially for writing code.
Send questions to a simple local web endpoint, like asking for the capital of France, and get smart answers back instantly.
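Querying the server goes through the standard OpenAI-compatible chat endpoint that vLLM exposes; a minimal Python sketch, where the port, base URL, and model name are assumptions for illustration:

```python
import json
import urllib.request


def build_chat_request(prompt: str, model: str = "qwen3.6-27b") -> dict:
    """Build an OpenAI-compatible chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }


def ask(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST the payload to the local server and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# With the server running locally:
# ask("What is the capital of France?")
```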
Now you have a fast, smart assistant at home that handles long stories, images, tools, and multiple users - all private and free from cloud costs.