voipmonitor / rtx6kpro
RTX 6000 Pro Wiki — Running Large LLMs (Qwen3.5-397B, Kimi-K2.5, GLM-5) on PCIe GPUs without NVLink
Community knowledge base with guides, benchmarks, and optimization tips for running massive language models on clusters of NVIDIA RTX 6000 Pro GPUs via PCIe without NVLink.
How It Works
This community notebook collects practical tips for maximizing inference throughput when running very large language models on workstation GPUs.
Browse per-model pages for large models such as Qwen or GLM, noting how many GPUs (2, 4, or 8) each needs for best results.
Read guidance on coordinating multiple cards over PCIe, without NVLink bridges, so they can share a model's weights smoothly.
Apply community-tested tuning settings (interconnect and batching parameters, for example) that users report can improve throughput by as much as 50%.
Run benchmarks that measure generation speed in tokens per second and compare your numbers against community results.
Share your configuration and results with other users running similar multi-GPU setups.
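The GPU-count guidance above mostly comes down to whether a model's weights fit in aggregate VRAM. A minimal sizing sketch, under illustrative assumptions not taken from the wiki: 96 GB per card, FP8 weights (1 byte per parameter), and a rough 20% overhead for KV cache and activations:

```python
# Rough VRAM sizing: how many GPUs does a model need?
# Assumptions (illustrative, not the wiki's figures): 96 GB per card,
# FP8 weights at 1 byte/parameter, ~20% overhead for KV cache/activations.
import math

GPU_VRAM_GB = 96        # assumed per-card VRAM
BYTES_PER_PARAM = 1.0   # FP8 quantized weights
OVERHEAD = 1.2          # KV cache + activations, rough guess

def min_gpus(params_billion: float) -> int:
    """Smallest GPU count, rounded up to a power of two for tensor parallelism."""
    total_gb = params_billion * BYTES_PER_PARAM * OVERHEAD  # 1e9 params ~ 1 GB at 1 B/param
    raw = math.ceil(total_gb / GPU_VRAM_GB)
    return 2 ** math.ceil(math.log2(max(raw, 1)))

print(f"Qwen3.5-397B: at least {min_gpus(397)} GPUs")  # 397 GB * 1.2 / 96 GB -> 5 -> round up to 8
```

Under these assumptions a 397B-parameter model lands in the 8-GPU tier, which matches the 2/4/8-card groupings the wiki pages use.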
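A launch over PCIe-only GPUs might look like the sketch below. This is an illustration, not the wiki's recipe: the model path is a placeholder derived from the title, and the NCCL variables shown are standard ones for inspecting or working around peer-to-peer transfer issues.

```shell
# Illustrative launch, not the wiki's exact recipe.
# NCCL_DEBUG=INFO prints the communication topology NCCL detects, so you can
# verify which PCIe paths the GPUs use to talk to each other.
export NCCL_DEBUG=INFO
# export NCCL_P2P_DISABLE=1   # uncomment only if direct P2P transfers hang;
                              # routes traffic through host memory instead

# Shard the model across 8 GPUs with tensor parallelism (vLLM example;
# the model path is a placeholder).
vllm serve Qwen/Qwen3.5-397B --tensor-parallel-size 8
```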
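Measuring tokens per second needs nothing more than a timer around a generation call. A minimal harness, with a dummy generator standing in for a real model call (all names here are placeholders):

```python
# Minimal tokens-per-second harness. `generate` is a placeholder for a real
# model call (API client, local inference, etc.); swap in your own.
import time

def tokens_per_second(generate, prompt: str) -> tuple[int, float]:
    """Time one generation and return (token_count, tokens/sec)."""
    start = time.perf_counter()
    tokens = generate(prompt)          # expected to return a list of tokens
    elapsed = time.perf_counter() - start
    return len(tokens), len(tokens) / elapsed

# Dummy stand-in: pretends to emit 500 tokens over ~0.1 s.
def fake_generate(prompt: str) -> list[str]:
    time.sleep(0.1)
    return ["tok"] * 500

count, tps = tokens_per_second(fake_generate, "Hello")
print(f"{count} tokens at {tps:.0f} tok/s")
```

For real comparisons against community numbers, average over several prompts and warm up the model first, since the first request typically pays one-time compilation and cache-allocation costs.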