AEON-7 / vllm-dflash
PublicDFlash vLLM for DGX Spark — Plug & Play Block-Diffusion Speculative Decoding
This repository offers a ready-to-run container for serving a fast, uncensored 27B AI model with image understanding on NVIDIA DGX Spark hardware using advanced speed techniques.
How It Works
You learn about a simple way to make your DGX Spark AI supercomputer give lightning-quick answers to questions and describe pictures.
Grab the ready-optimized uncensored AI model from a trusted sharing site to your computer.
Write a short note with the model's location and a private password to keep everything secure.
Hit go with an easy start command, and watch as your personal fast AI server springs to life.
Talk to it like a friend by sending questions or pictures, getting smart replies right away.
Feel the thrill of 2-3 times faster responses, turning slow waits into smooth, fun conversations.
Star Growth
Repurpose is a Pro feature
Generate ready-to-use prompts for X threads, LinkedIn posts, blog posts, YouTube scripts, and more -- with full repo context baked in.
Unlock RepurposeSimilar repos coming soon.