tonbistudio / moe-ssd-streaming-windows
Running a 32 GB AI model on 28 GB of memory: MoE expert streaming from NVMe SSD on Windows
This repository provides instructions, benchmarks, and tools for running large Mixture-of-Experts AI models on Windows PCs with limited memory by streaming expert weights from an NVMe SSD.
How It Works
Learn how a Mixture-of-Experts model that is larger than your PC's memory can still run on an ordinary gaming PC.
Confirm that your Windows PC has an NVIDIA graphics card, at least 16 GB of memory, and a fast NVMe solid-state drive.
Download the free AI runner program and a large AI model file, and save both in a simple folder on your drive.
Start the model with a single command that keeps the main parts in graphics memory and streams the remaining expert weights from the SSD on demand.
Start asking questions; the model responds at roughly 2-4 words per second.
The result is a 32 GB model running on a budget setup with 28 GB of memory.
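The idea behind the steps above can be illustrated with a small sketch. It is not the repository's runner or its actual command: it is a toy, standard-library-only Python example that assumes the expert weights sit back to back in one file and that a router supplies a score per expert. The names `N_EXPERTS`, `DIM`, `TOP_K`, `write_experts`, `load_expert`, `moe_forward`, and `demo` are all hypothetical. The file is memory-mapped, so only the bytes of the experts that are actually selected for a token get read from disk.

```python
import mmap
import os
import struct
import tempfile

# All sizes below are hypothetical toy values, not taken from any real model.
N_EXPERTS = 8                  # experts in one MoE layer
DIM = 4                        # each expert is a DIM x DIM float32 matrix
TOP_K = 2                      # experts activated per token
EXPERT_BYTES = DIM * DIM * 4   # bytes occupied by one expert in the file


def write_experts(path):
    """Write toy expert matrices back to back, standing in for a model file."""
    with open(path, "wb") as f:
        for e in range(N_EXPERTS):
            weights = [float(e + 1)] * (DIM * DIM)
            f.write(struct.pack(f"<{DIM * DIM}f", *weights))


def load_expert(mm, expert_id):
    """Decode one expert from the mapping; only its byte range is touched."""
    start = expert_id * EXPERT_BYTES
    flat = struct.unpack_from(f"<{DIM * DIM}f", mm, start)
    return [flat[r * DIM:(r + 1) * DIM] for r in range(DIM)]


def moe_forward(mm, x, router_scores):
    """Apply the TOP_K highest-scoring experts to x and mix their outputs."""
    chosen = sorted(range(N_EXPERTS),
                    key=lambda e: router_scores[e], reverse=True)[:TOP_K]
    total = sum(router_scores[e] for e in chosen)
    out = [0.0] * DIM
    for e in chosen:
        w = load_expert(mm, e)            # streamed from the file on demand
        gate = router_scores[e] / total   # normalized weight of this expert
        for r in range(DIM):
            out[r] += gate * sum(w[r][c] * x[c] for c in range(DIM))
    return chosen, out


def demo():
    """Build a temporary expert file, map it, and run one token through it."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    try:
        write_experts(path)
        with open(path, "rb") as f, \
                mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            x = [1.0, 1.0, 1.0, 1.0]
            scores = [0.0, 0.1, 0.0, 0.6, 0.0, 0.2, 0.0, 0.1]
            return moe_forward(mm, x, scores)
    finally:
        os.remove(path)


if __name__ == "__main__":
    chosen, out = demo()
    print("experts used:", chosen)
    print("output:", out)
```

In this toy run only two of the eight experts are decoded for the token; the other six are never read. With a memory-mapped file the operating system decides which pages stay cached in memory, which is why a fast SSD matters when the full set of experts does not fit.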