cool-japan / oxicuda

OxiCUDA replaces the entire NVIDIA CUDA Toolkit software stack with type-safe, memory-safe Rust code.

AI Summary

OxiCUDA is a pure Rust library that replaces the NVIDIA CUDA toolkit, enabling high-performance GPU computing for linear algebra, deep learning, FFT, sparse operations, and more without needing the CUDA SDK or compiler.
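Dense linear algebra is the core workload named above. As a minimal runnable sketch in plain Rust (no OxiCUDA API is assumed here), this is the naive matrix multiply that a cuBLAS-style library would offload to the GPU:

```rust
// Naive row-major matrix multiply: the kind of kernel a cuBLAS-style
// library accelerates on the GPU. Plain CPU Rust for illustration only.
fn matmul(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    let mut c = vec![0.0f32; m * n];
    for i in 0..m {
        for p in 0..k {
            let aip = a[i * k + p];
            for j in 0..n {
                c[i * n + j] += aip * b[p * n + j];
            }
        }
    }
    c
}

fn main() {
    // 2x2 example: [[1,2],[3,4]] x [[5,6],[7,8]] = [[19,22],[43,50]]
    let a = [1.0, 2.0, 3.0, 4.0];
    let b = [5.0, 6.0, 7.0, 8.0];
    let c = matmul(&a, &b, 2, 2, 2);
    assert_eq!(c, vec![19.0, 22.0, 43.0, 50.0]);
    println!("{:?}", c);
}
```

A GPU library replaces this triple loop with a tuned kernel, but the input/output contract (row-major buffers, dimensions m, k, n) is the same.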

How It Works

1. 👀 Discover fast GPU math

You hear about OxiCUDA, a simple way to make your number-crunching programs run fast on an NVIDIA graphics card without extra software installs.

2. 📥 Add to your project

You add the crate as a dependency and wire it into your math or machine-learning code.

3. 🎯 Pick your graphics card

The library detects the graphics cards on your machine and selects the best one automatically.

4. ✍️ Write familiar math code

You call everyday math functions like matrix multiply or Fourier transforms, just as you would on the CPU, only much faster.

5. 🚀 Run and feel the speed

Run the program and the heavy calculations are offloaded to the graphics card.

Enjoy blazing results

Your simulations, AI training, or data analysis now finish in minutes instead of hours, ready for real-world use.
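Step 3's "automatic" device pick usually amounts to a priority-ordered probe: try the fastest backend first, then fall back. A minimal CPU-side sketch of that logic (all names are hypothetical; this is not OxiCUDA's actual API):

```rust
// Hypothetical backend-probe sketch: choose the first available backend
// in priority order, the way automatic device selection typically works.
// The availability flags are stubbed; a real probe would query drivers.
#[derive(Debug, PartialEq)]
enum Backend { Cuda, Vulkan, Metal, Cpu }

fn pick_backend(cuda_ok: bool, vulkan_ok: bool, metal_ok: bool) -> Backend {
    // Prefer the fastest path, fall back gracefully.
    if cuda_ok { Backend::Cuda }
    else if vulkan_ok { Backend::Vulkan }
    else if metal_ok { Backend::Metal }
    else { Backend::Cpu }
}

fn main() {
    // On a machine where only Vulkan is available:
    assert_eq!(pick_backend(false, true, false), Backend::Vulkan);
    // No GPU at all falls back to the CPU:
    assert_eq!(pick_backend(false, false, false), Backend::Cpu);
    println!("ok");
}
```

The design choice worth noting is the graceful-degradation order: code written against the library keeps running on machines without an NVIDIA GPU, just more slowly.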

AI-Generated Review

What is oxicuda?

OxiCUDA replaces the entire NVIDIA CUDA toolkit software stack with type-safe, memory-safe Rust code, letting you run GPU compute without nvcc, C++ toolchains, or SDK installs—just the NVIDIA driver at runtime. Developers get cuBLAS, cuDNN, cuFFT, cuSPARSE, cuSOLVER, and more in a single Rust crate, with PTX generation and autotuning for near-peak performance on Turing-to-Blackwell GPUs. It's a drop-in for high-perf linear algebra, deep learning ops, FFTs, and inference pipelines.

Why is it gaining traction?

No build-time CUDA deps means faster compiles and easier CI/CD, while Rust's safety eliminates segfaults and races common in CUDA C++. Autotuning hits 90-95% of native perf across precisions (FP8 to FP64), and multi-backend support (Vulkan, Metal, WebGPU, ROCm) adds portability beyond NVIDIA hardware. Early adopters praise the pure-Rust policy and 7k+ tests for reliable GPU kernels.
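"Autotuning" here means empirically picking the fastest kernel configuration at runtime: benchmark a few candidate launch parameters and keep the winner. A toy CPU-side version of that loop (the chunked-sum workload and candidate sizes are illustrative, not OxiCUDA's actual tuner):

```rust
use std::time::{Duration, Instant};

// Toy autotuner: time a workload at several candidate block sizes and
// keep the fastest, as a GPU library's autotuner does for real kernel
// launch parameters. The workload is a chunked sum over a buffer.
fn chunked_sum(data: &[f64], chunk: usize) -> f64 {
    data.chunks(chunk).map(|c| c.iter().sum::<f64>()).sum()
}

fn autotune(data: &[f64], candidates: &[usize]) -> usize {
    let mut best = (candidates[0], Duration::MAX);
    for &chunk in candidates {
        let t0 = Instant::now();
        let s = chunked_sum(data, chunk);
        let dt = t0.elapsed();
        assert!(s.is_finite()); // keep the call from being optimized away
        if dt < best.1 {
            best = (chunk, dt);
        }
    }
    best.0
}

fn main() {
    let data = vec![1.0f64; 1 << 16];
    let best = autotune(&data, &[64, 256, 1024]);
    // The winner varies by machine, but is always one of the candidates.
    assert!([64, 256, 1024].contains(&best));
    println!("best chunk = {}", best);
}
```

A real tuner would cache the winning configuration per GPU and problem shape so the measurement cost is paid once, not on every call.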

Who should use this?

Rust ML engineers training transformers or running ONNX inference on NVIDIA GPUs, scientific coders needing BLAS/FFT/sparse solvers without SDK headaches, and backend devs building cross-platform compute (e.g., WebGPU for browsers). Ideal for teams ditching CUDA's toolchain lock-in.

Verdict

Promising foundation for CUDA-free GPU development, but 26 stars and a 1.0% credibility score signal an early-stage project: v0.1.1 lacks polished docs and production benchmarks. Experiment if you're Rust-native; wait for v1.0 if stability matters.

