abdimoallim / psimd

A portable, header-only SIMD library for C (SSE2, SSE4.1, AVX/AVX2+FMA, NEON/AArch64, WebAssembly SIMD128, scalar fallback)

11
1
100% credibility
Found Apr 03, 2026 at 11 stars.
AI Analysis
C
AI Summary

psimd is a single-header C library that offers one consistent way to run fast batch math operations across multiple CPU architectures.

How It Works

1
🔍 Discover psimd

You hear about a handy single-file toolkit that makes heavy number crunching in programs run much faster on everyday computers.

2
📥 Grab the file

Download the one small file and place it right next to your program's files so it's easy to use.

3
Add speedy math

Swap your usual step-by-step number work for quick batch operations that handle four or eight numbers at once.

4
🧪 Try the examples

Run the built-in checks and sample math jobs to see the speed boost in action right away.

5
⚙️ Tune for your setup

Pick the simple go-fast compiler option that matches your computer's strengths; no changes to your code are needed.

🚀 Lightning calculations

Your program now flies through big math tasks smoothly on desktops, mobiles, or web browsers, saving time and power.
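The "tune for your setup" step comes down to compiler flags: GCC and Clang define well-known macros such as `__AVX2__`, `__FMA__`, `__SSE2__`, `__ARM_NEON`, and `__wasm_simd128__` depending on the flags you pass. The sketch below shows how a header could use those macros to pick a backend at compile time. The function name `sketch_backend_name` is hypothetical and is not part of psimd; how psimd itself dispatches is not shown on this page.

```c
/* Hypothetical helper (not a psimd function): reports which instruction
   set the compiler was told to target, using predefined compiler macros. */
static const char *sketch_backend_name(void) {
#if defined(__AVX2__) && defined(__FMA__)
    return "AVX2+FMA";      /* e.g. built with -mavx2 -mfma */
#elif defined(__AVX__)
    return "AVX";           /* e.g. built with -mavx */
#elif defined(__SSE4_1__)
    return "SSE4.1";        /* e.g. built with -msse4.1 */
#elif defined(__SSE2__)
    return "SSE2";          /* baseline on x86-64 with GCC/Clang */
#elif defined(__ARM_NEON) || defined(__aarch64__)
    return "NEON";          /* e.g. -mfpu=neon on 32-bit ARM */
#elif defined(__wasm_simd128__)
    return "WASM SIMD128";  /* e.g. built with -msimd128 */
#else
    return "scalar";        /* no SIMD macro found: plain C fallback */
#endif
}
```

Printing the result of `sketch_backend_name()` at startup is a quick way to confirm that the flag you chose actually took effect.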

AI-Generated Review

What is psimd?

psimd is a portable, header-only SIMD library for C that lets you accelerate numerical loops on x86 (SSE2, SSE4.1, AVX/AVX2+FMA), ARM (NEON/AArch64), and WebAssembly SIMD128, plus a scalar fallback. Include one header, pick compiler flags such as -mavx2 or -mfpu=neon, and use fixed-width types such as f32x4 (always four floats) with operations like psimd_add_f32x4 and mask-based selects. It hides the per-platform intrinsics, so the same code compiles and runs on each target without ISA-specific changes.
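To make the "always four floats" idea concrete, here is a minimal scalar sketch of a fixed-width vector type and an add operation, processing an array four lanes at a time with a scalar tail. Everything prefixed `sketch_` is a hypothetical stand-in written for this illustration; it only mirrors the naming pattern of `f32x4` and `psimd_add_f32x4` and is not psimd's actual definition or implementation.

```c
#include <stddef.h>

/* Hypothetical stand-in for a fixed-width vector: always 4 floats. */
typedef struct { float lane[4]; } sketch_f32x4;

/* Load 4 consecutive floats from memory into a vector. */
static sketch_f32x4 sketch_load_f32x4(const float *p) {
    sketch_f32x4 v;
    for (int i = 0; i < 4; ++i) v.lane[i] = p[i];
    return v;
}

/* Store the 4 lanes of a vector back to memory. */
static void sketch_store_f32x4(float *p, sketch_f32x4 v) {
    for (int i = 0; i < 4; ++i) p[i] = v.lane[i];
}

/* Lane-wise addition; a SIMD backend would do this in one instruction. */
static sketch_f32x4 sketch_add_f32x4(sketch_f32x4 a, sketch_f32x4 b) {
    sketch_f32x4 r;
    for (int i = 0; i < 4; ++i) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* out[i] = a[i] + b[i]: full 4-float chunks first, then leftovers. */
static void sketch_add_arrays(const float *a, const float *b,
                              float *out, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        sketch_f32x4 va = sketch_load_f32x4(a + i);
        sketch_f32x4 vb = sketch_load_f32x4(b + i);
        sketch_store_f32x4(out + i, sketch_add_f32x4(va, vb));
    }
    for (; i < n; ++i) out[i] = a[i] + b[i];  /* scalar tail */
}
```

Because the width is fixed at four, the loop structure stays the same on every target; only the body of the load, add, and store helpers would change between backends.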

Why is it gaining traction?

It has zero dependencies beyond the C standard library, explicit vector widths prevent platform surprises, and masks feed branchless selects directly, matching how SSE, AVX, NEON, and WASM handle them natively. Runtime backend checks confirm whether AVX/AVX2+FMA or the scalar fallback is in use, and 107 tests validate kernels such as dot products and ReLU. As a single header downloaded from GitHub, it drops straight into C applications and CLI tools.
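The mask-and-select pattern mentioned above can be sketched in plain C. A comparison produces a per-lane mask of all ones or all zeros, and the select combines two vectors with bitwise operations instead of an `if`. The ReLU kernel below is built from that pair. All `demo_` names are hypothetical and written for this illustration; they are not psimd's API, and this scalar version only imitates what a SIMD backend would do in hardware.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-lane float vector and its matching 32-bit lane mask. */
typedef struct { float lane[4]; } demo_f32x4;
typedef struct { uint32_t lane[4]; } demo_mask32x4;

/* Per-lane a > b: all bits set where true, all bits clear where false. */
static demo_mask32x4 demo_cmpgt_f32x4(demo_f32x4 a, demo_f32x4 b) {
    demo_mask32x4 m;
    for (int i = 0; i < 4; ++i)
        m.lane[i] = (a.lane[i] > b.lane[i]) ? 0xFFFFFFFFu : 0u;
    return m;
}

/* Per-lane select: take a where the mask is set, b elsewhere.
   The choice is made with AND/OR on the raw bits, not a branch. */
static demo_f32x4 demo_select_f32x4(demo_mask32x4 m,
                                    demo_f32x4 a, demo_f32x4 b) {
    demo_f32x4 r;
    for (int i = 0; i < 4; ++i) {
        uint32_t ua, ub, ur;
        memcpy(&ua, &a.lane[i], sizeof ua);  /* safe float-to-bits copy */
        memcpy(&ub, &b.lane[i], sizeof ub);
        ur = (ua & m.lane[i]) | (ub & ~m.lane[i]);
        memcpy(&r.lane[i], &ur, sizeof ur);
    }
    return r;
}

/* ReLU: keep x where x > 0, otherwise output 0. */
static demo_f32x4 demo_relu_f32x4(demo_f32x4 x) {
    demo_f32x4 zero = {{0.0f, 0.0f, 0.0f, 0.0f}};
    return demo_select_f32x4(demo_cmpgt_f32x4(x, zero), x, zero);
}
```

Keeping the mask as a full-width integer lane is what lets the same select work unchanged after any comparison, which is the shape SSE, NEON, and WASM SIMD comparisons produce.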

Who should use this?

C developers building numerical kernels for ML inference, image filters, audio DSP, or games that need vector math on desktop, mobile, or WASM. It also suits embedded engineers porting AVX/AVX2+FMA code to NEON/AArch64 with a scalar fallback, and anyone porting compute-heavy tools to WASM.

Verdict

Grab it for portable SIMD in new C projects: the docs shine, the tests pass everywhere, and the API feels natural. With only 11 stars and 1.0% credibility, watch for edge cases, though the header-only design lowers the risk of trying it.
