cool-japan / oxicuda

OxiCUDA replaces the entire NVIDIA CUDA Toolkit software stack with type-safe, memory-safe Rust code.

AI Summary

OxiCUDA is a pure Rust library that replaces the NVIDIA CUDA toolkit, enabling high-performance GPU computing for linear algebra, deep learning, FFT, sparse operations, and more without needing the CUDA SDK or compiler.
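Dense linear algebra is the core workload named above. As a minimal runnable sketch in plain Rust (no OxiCUDA API is assumed here), this is the naive matrix multiply that a cuBLAS-style library would offload to the GPU:

```rust
// Naive row-major matrix multiply: the kind of kernel a cuBLAS-style
// library accelerates on the GPU. Plain CPU Rust for illustration only.
fn matmul(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    let mut c = vec![0.0f32; m * n];
    for i in 0..m {
        for p in 0..k {
            let aip = a[i * k + p];
            for j in 0..n {
                c[i * n + j] += aip * b[p * n + j];
            }
        }
    }
    c
}

fn main() {
    // 2x2 example: [[1,2],[3,4]] x [[5,6],[7,8]] = [[19,22],[43,50]]
    let a = [1.0, 2.0, 3.0, 4.0];
    let b = [5.0, 6.0, 7.0, 8.0];
    let c = matmul(&a, &b, 2, 2, 2);
    assert_eq!(c, vec![19.0, 22.0, 43.0, 50.0]);
    println!("{:?}", c);
}
```

A GPU library replaces this triple loop with a tuned kernel, but the input/output contract (row-major buffers, dimensions m, k, n) is the same.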

How It Works

1. 👀 Discover fast GPU math

You hear about OxiCUDA, a simple way to make your number-crunching programs run fast on an NVIDIA graphics card without extra software installs.

2. 📥 Add to your project

You add the crate as a dependency and wire it into your math or machine-learning code.

3. 🎯 Pick your graphics card

The library detects the graphics cards on your machine and selects the best one automatically.

4. ✍️ Write familiar math code

You call everyday math functions like matrix multiply or Fourier transforms, just as you would on the CPU, only much faster.

5. 🚀 Run and feel the speed

Run the program and the heavy calculations are offloaded to the graphics card.

Enjoy blazing results

Your simulations, AI training, or data analysis now finish in minutes instead of hours, ready for real-world use.
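Step 3's "automatic" device pick usually amounts to a priority-ordered probe: try the fastest backend first, then fall back. A minimal CPU-side sketch of that logic (all names are hypothetical; this is not OxiCUDA's actual API):

```rust
// Hypothetical backend-probe sketch: choose the first available backend
// in priority order, the way automatic device selection typically works.
// The availability flags are stubbed; a real probe would query drivers.
#[derive(Debug, PartialEq)]
enum Backend { Cuda, Vulkan, Metal, Cpu }

fn pick_backend(cuda_ok: bool, vulkan_ok: bool, metal_ok: bool) -> Backend {
    // Prefer the fastest path, fall back gracefully.
    if cuda_ok { Backend::Cuda }
    else if vulkan_ok { Backend::Vulkan }
    else if metal_ok { Backend::Metal }
    else { Backend::Cpu }
}

fn main() {
    // On a machine where only Vulkan is available:
    assert_eq!(pick_backend(false, true, false), Backend::Vulkan);
    // No GPU at all falls back to the CPU:
    assert_eq!(pick_backend(false, false, false), Backend::Cpu);
    println!("ok");
}
```

The design choice worth noting is the graceful-degradation order: code written against the library keeps running on machines without an NVIDIA GPU, just more slowly.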

AI-Generated Review

What is oxicuda?

OxiCUDA replaces the entire NVIDIA CUDA toolkit software stack with type-safe, memory-safe Rust code, letting you run GPU compute without nvcc, C++ toolchains, or SDK installs—just the NVIDIA driver at runtime. Developers get cuBLAS, cuDNN, cuFFT, cuSPARSE, cuSOLVER, and more in a single Rust crate, with PTX generation and autotuning for near-peak performance on Turing-to-Blackwell GPUs. It's a drop-in for high-perf linear algebra, deep learning ops, FFTs, and inference pipelines.

Why is it gaining traction?

No build-time CUDA deps means faster compiles and easier CI/CD, while Rust's safety eliminates segfaults and races common in CUDA C++. Autotuning hits 90-95% of native perf across precisions (FP8 to FP64), and multi-backend support (Vulkan, Metal, WebGPU, ROCm) adds portability beyond NVIDIA hardware. Early adopters praise the pure-Rust policy and 7k+ tests for reliable GPU kernels.
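"Autotuning" here means empirically picking the fastest kernel configuration at runtime: benchmark a few candidate launch parameters and keep the winner. A toy CPU-side version of that loop (the chunked-sum workload and candidate sizes are illustrative, not OxiCUDA's actual tuner):

```rust
use std::time::{Duration, Instant};

// Toy autotuner: time a workload at several candidate block sizes and
// keep the fastest, as a GPU library's autotuner does for real kernel
// launch parameters. The workload is a chunked sum over a buffer.
fn chunked_sum(data: &[f64], chunk: usize) -> f64 {
    data.chunks(chunk).map(|c| c.iter().sum::<f64>()).sum()
}

fn autotune(data: &[f64], candidates: &[usize]) -> usize {
    let mut best = (candidates[0], Duration::MAX);
    for &chunk in candidates {
        let t0 = Instant::now();
        let s = chunked_sum(data, chunk);
        let dt = t0.elapsed();
        assert!(s.is_finite()); // keep the call from being optimized away
        if dt < best.1 {
            best = (chunk, dt);
        }
    }
    best.0
}

fn main() {
    let data = vec![1.0f64; 1 << 16];
    let best = autotune(&data, &[64, 256, 1024]);
    // The winner varies by machine, but is always one of the candidates.
    assert!([64, 256, 1024].contains(&best));
    println!("best chunk = {}", best);
}
```

A real tuner would cache the winning configuration per GPU and problem shape so the measurement cost is paid once, not on every call.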

Who should use this?

Rust ML engineers training transformers or running ONNX inference on NVIDIA GPUs, scientific coders needing BLAS/FFT/sparse solvers without SDK headaches, and backend devs building cross-platform compute (e.g., WebGPU for browsers). Ideal for teams ditching CUDA's toolchain lock-in.

Verdict

Promising foundation for CUDA-free GPU development, but 26 stars and a 1.0% credibility score signal an early-stage project: v0.1.1 lacks polished docs and production benchmarks. Experiment if you're Rust-native; wait for v1.0 if stability matters.

