r0b0tlab / minimax-m27-nvfp4-gb10-benchmark
MiniMax M2.7 NVFP4 dual-GB10 Blackwell benchmark: vLLM FlashInfer-CUTLASS, public data, HTML canvas report, and Docker runtime.
This is a benchmark and reproducibility project for running the MiniMax M2.7 model in NVIDIA's NVFP4 4-bit floating-point format on dual NVIDIA GB10 Blackwell GPUs. The project provides a Docker container, optimized launch scripts, and benchmark results showing throughput of about 25 tokens per second. It includes a safety checker script that helps ensure no secrets are accidentally published, and the code is released under the MIT license. The project is aimed at researchers and developers who want to run this specific model on this specific NVIDIA hardware.
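To make the 4-bit memory argument concrete, here is a rough sketch of how weight storage could be estimated for a block-scaled 4-bit format. It assumes NVFP4 stores 4-bit values with one 8-bit scale per 16-value block (about 4.5 bits per weight) and ignores KV cache, activations, and runtime overhead beyond a flat reserve. The function names, the parameter count, and the 256 GiB memory total are illustrative assumptions, not figures taken from the project.

```python
def nvfp4_weight_gib(num_params: float, block_size: int = 16, scale_bits: int = 8) -> float:
    """Approximate weight storage in GiB for a block-scaled 4-bit format.

    Assumption: each weight takes 4 bits, plus one shared scale of
    `scale_bits` bits per `block_size` weights.
    """
    bits_per_weight = 4 + scale_bits / block_size  # 4.5 bits with the defaults
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_memory(num_params: float, memory_gib: float, reserve_fraction: float = 0.2) -> bool:
    """Check whether the weights fit while keeping a fraction of memory free.

    The reserve is a crude stand-in for KV cache, activations, and the OS.
    """
    usable_gib = memory_gib * (1.0 - reserve_fraction)
    return nvfp4_weight_gib(num_params) <= usable_gib


if __name__ == "__main__":
    # Hypothetical numbers for illustration only: a 100-billion-parameter
    # model and 256 GiB of combined memory across two devices.
    params = 100e9
    print(f"weights: {nvfp4_weight_gib(params):.1f} GiB")
    print("fits:", fits_in_memory(params, memory_gib=256.0))
```

Substituting the real parameter count and the memory actually available on the target machines gives a first-order check before downloading the checkpoint.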
How It Works
You discover a project that shows how to run the MiniMax AI model at high speed on powerful NVIDIA GPUs.
The project tells you that you need two NVIDIA GB10 systems (such as a pair of DGX Spark units) with enough combined memory to hold the model.
You download the official MiniMax model from NVIDIA (after accepting their license) and pull the ready-to-run container.
With one launch script, the model server starts up using NVFP4 4-bit precision, which lets the model fit in your GPUs' memory.
The benchmark shows your setup running at about 25 tokens per second, slightly faster than the public baseline.
Your MiniMax assistant is now running on your Blackwell GPUs, ready to help with reasoning and tool-calling tasks.
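As a rough illustration of how a tokens-per-second figure like the one above can be derived and compared with a baseline, the following sketch aggregates generated tokens over wall-clock time. The `Run` record, the function names, and the sample numbers are hypothetical and are not taken from the project's benchmark scripts or data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Run:
    """One benchmark request: tokens generated and wall-clock seconds taken."""
    completion_tokens: int
    elapsed_s: float


def tokens_per_second(runs: Iterable[Run]) -> float:
    """Aggregate decode throughput: total generated tokens over total time."""
    runs = list(runs)
    total_tokens = sum(r.completion_tokens for r in runs)
    total_time = sum(r.elapsed_s for r in runs)
    if total_time <= 0:
        raise ValueError("total elapsed time must be positive")
    return total_tokens / total_time


def speedup_percent(measured: float, baseline: float) -> float:
    """Relative difference of a measured throughput versus a baseline, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (measured / baseline - 1.0) * 100.0


if __name__ == "__main__":
    # Made-up sample runs, for illustration only.
    sample = [Run(completion_tokens=500, elapsed_s=20.0),
              Run(completion_tokens=250, elapsed_s=10.0)]
    tps = tokens_per_second(sample)
    print(f"{tps:.1f} tok/s, {speedup_percent(tps, baseline=20.0):+.1f}% vs baseline")
```

Summing tokens and time across runs, rather than averaging per-run rates, keeps short requests from skewing the result.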