ZengboJamesWang

Run Qwen3.5-35B-A3B with llama.cpp and openclaw on NVIDIA DGX Spark (GB10)

14 stars · 2 forks · 100% credibility · Found Mar 04, 2026
Language: Shell

AI Summary

This repository provides setup scripts and a compatibility bridge for running Qwen3.5-35B-A3B locally on NVIDIA DGX Spark (GB10) hardware, so the model can serve as the brain behind an openclaw agent.

How It Works

1
🔍 Discover local AI power

You find a guide for running the capable Qwen3.5-35B-A3B model entirely on your DGX Spark: fast, private, and cloud-free.

2
📦 Run the easy installer

You follow a few simple steps, letting the provided install script download the model weights and build everything it needs.

3
🚀 Start the AI helper

You launch the start script, which brings up llama-server and the compatibility proxy as background services, and your local AI wakes up.

4
🔗 Link to your AI chat app

You point openclaw's settings at the local endpoint so it knows where to find your new model.

5
💬 Chat with your genius

You start asking questions, with tool calls for searches or code help working out of the box and replies streaming back at roughly 43 tok/s.

6
🧠 Unlock deep thinking

For tricky problems, you prefix your message with [think] to unlock the model's full step-by-step reasoning.

AI agent ready

Now you have a fully capable, lightning-fast AI companion on your machine for coding, math, or any challenge.
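Concretely, the steps above probably boil down to a flow like the following. Every script name and path here is a guess, not taken from the repo's docs (the review only says "setup is two sudo commands"), so treat this as a sketch and check the repo's README for the real commands:

```shell
# Hypothetical end-to-end setup -- script names, paths, and port are assumptions.
git clone https://github.com/ZengboJamesWang/Qwen3.5-35B-A3B-openclaw-dgx-spark.git
cd Qwen3.5-35B-A3B-openclaw-dgx-spark

# The "two sudo commands" plausibly look like:
sudo ./install.sh   # build llama.cpp with CUDA, download the GGUF weights
sudo ./start.sh     # install and start the llama-server + proxy systemd services

# Then point openclaw at the local OpenAI-compatible endpoint, e.g.
#   http://127.0.0.1:8080/v1
```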


AI-Generated Review

What is Qwen3.5-35B-A3B-openclaw-dgx-spark?

This Shell-based repo lets you run Qwen3.5-35B-A3B, a 35B MoE model, on NVIDIA DGX Spark GB10 hardware using llama.cpp, with full openclaw integration for AI agents. It solves compatibility headaches between llama-server and openclaw—rewriting message roles and controlling reasoning modes—delivering ~43 tok/s generation, 128k context, and tool calls out of the box. Run install scripts to build everything, start services, and plug into openclaw via a local endpoint.
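Since everything hangs off a local endpoint, a quick smoke test from the shell can confirm llama-server is answering before wiring up openclaw. The port here is an assumption, not taken from the repo; llama-server's HTTP server exposes OpenAI-compatible routes such as `/v1/models`:

```shell
# Probe the (assumed) local OpenAI-compatible endpoint exposed by llama-server.
# Port 8080 is a guess -- use whatever the install scripts configure.
ENDPOINT="http://127.0.0.1:8080/v1"
if curl -fsS --max-time 2 "$ENDPOINT/models" >/dev/null 2>&1; then
  STATUS="up"
else
  STATUS="down"
fi
echo "llama-server endpoint: $STATUS"
```

If it reports `down`, check the background services the installer set up before pointing openclaw at the endpoint.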

Why is it gaining traction?

It nails local runs of a high-end Qwen3.5-35B-A3B model on DGX's 122GB unified memory, offloading ~20GB for the model while leaving room for massive contexts—far snappier than cloud alternatives for single-user workflows. The [think] prefix toggles on-demand reasoning without config tweaks, and systemd services keep llama.cpp and the proxy humming in the background. Devs dig the automated setup for NVIDIA GB10, skipping manual CUDA builds and proxy hacks.
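The [think] toggle can be exercised straight from the shell. A minimal sketch of building such a request body, assuming the OpenAI-style chat schema that llama-server speaks (the message contents and the helper name are illustrative; only the [think] prefix convention comes from this repo):

```shell
# Build a chat-completion request body; prefixing the user message with
# "[think]" requests full step-by-step reasoning (this repo's convention).
chat_payload() {
  printf '{"messages":[{"role":"user","content":"%s"}]}' "$1"
}

# Fast path: plain question, no reasoning trace
FAST=$(chat_payload "Summarize this stack trace.")

# Deep path: step-by-step reasoning on demand
DEEP=$(chat_payload "[think] Why does this loop deadlock?")

echo "$DEEP"
```

Either payload would then be POSTed to the local endpoint, e.g. `curl -d "$DEEP" http://127.0.0.1:8080/v1/chat/completions` (port assumed).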

Who should use this?

AI agent builders on DGX Spark who want openclaw-powered tool calls and reasoning without cloud latency or costs. Local LLM tinkerers testing Qwen3.5-35B-A3B for code debugging, math, or architecture decisions. Teams running GitHub workflows locally or experimenting with Copilot-like agents on beefy NVIDIA hardware.

Verdict

Grab it if you own DGX Spark and need Qwen3.5-35B-A3B in openclaw today—docs are solid, setup is two sudo commands. At 14 stars and 1.0% credibility, it's early and hardware-specific; fork and contribute if you scale it beyond GB10.


