Cintu07 / ciot

Public

cpu inference for ternary neural nets. no deps. just c++ and simd.

10
0
89% credibility
Found May 25, 2026 at 17 stars.
AI Analysis
Python
AI Summary

Ciot is a lightweight inference engine that runs neural networks directly on the CPU. It relies on ternary weights: every weight in the model is limited to one of three values, -1, 0, or +1. That replaces multiplications with additions and subtractions and lets the processor handle many weights at once with SIMD instructions. The project includes a fast inference engine written in C++, tools to train tiny models from text files, and benchmarks that check each result against a reference so the reported speed is backed by correct output. Users can train their own small text-generating models and run them entirely on CPU hardware, reaching billions of operations per second without GPUs, frameworks, or external services.

How It Works

1
💡 You hear about lightning-fast AI on your laptop

A friend tells you about a project that runs AI models directly on your computer's processor, no expensive graphics card needed.

2
🔢 You discover the clever trick behind it

Instead of full-precision floating-point numbers, the model's weights use only three values: -1, 0, and +1. Multiplying by such a weight reduces to adding, subtracting, or skipping a value, which is cheap on any CPU.

3
🔨 You build your AI engine in seconds

With one simple command, your computer compiles everything. The program detects your processor type and sets itself up automatically.

4
📚 You teach it your own tiny model

You feed it a text file you like, and the training script teaches a small AI to talk like that text. It learns word patterns and creates its own vocabulary.

5
You choose what to explore
✍️
Generate text with your model

Give your AI a starting word and watch it continue the story, choosing each next word based on what it learned.

⚡
Run speed benchmarks

See exactly how many calculations your computer can perform per second, with proof that the results are correct.

🎉 Your AI runs on any computer

You've got a small text-generating model running on your laptop's CPU at several billion operations per second, without needing any special hardware.
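The trick from step 2 can be illustrated with a minimal scalar sketch. It assumes weights stored one per byte and integer activations; the function name `ternary_dot` is hypothetical and is not taken from the ciot source. Because each weight is -1, 0, or +1, the loop never multiplies: it only adds, subtracts, or skips.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from ciot): dot product with ternary weights.
// +1 adds the activation, -1 subtracts it, 0 skips it. No multiplications.
std::int32_t ternary_dot(const std::vector<std::int8_t>& weights,
                         const std::vector<std::int32_t>& activations) {
    assert(weights.size() == activations.size());
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        if (weights[i] > 0) {
            acc += activations[i];
        } else if (weights[i] < 0) {
            acc -= activations[i];
        }
    }
    return acc;
}
```

For example, weights {1, 0, -1, 1} applied to activations {5, 7, 2, 3} give 5 - 2 + 3 = 6. A real kernel would process many weights per instruction rather than one per loop iteration.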

AI-Generated Review

What is ciot?

Ciot is a CPU inference engine that runs neural networks with weights quantized to just three values: -1, 0, and +1. Instead of floating-point matrices, it packs 64 weights into two 64-bit integers using bit-plane encoding. The project is built in C++ with hand-written SIMD kernels for ARM NEON, AVX2, and AVX-512, plus Python scripts for training and converting models to the .bits format. No PyTorch, no TensorFlow, no BLAS. Just compile and run.
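To make the bit-plane idea concrete, here is a minimal sketch of packing 64 ternary weights into two 64-bit words. The layout is an assumption: one word marks the +1 positions and the other marks the -1 positions. The review does not say which planes ciot actually stores, and the names `TernaryBlock`, `pack64`, `weight_at`, and `dot64` are hypothetical.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Assumed layout (not confirmed for ciot): bit i of `plus` is set when
// weight i is +1, bit i of `minus` is set when weight i is -1.
// Both bits clear means the weight is 0.
struct TernaryBlock {
    std::uint64_t plus = 0;
    std::uint64_t minus = 0;
};

// Pack 64 weights, each -1, 0, or +1, into two 64-bit planes.
TernaryBlock pack64(const std::int8_t* w) {
    TernaryBlock b;
    for (int i = 0; i < 64; ++i) {
        if (w[i] > 0) {
            b.plus |= (std::uint64_t{1} << i);
        } else if (w[i] < 0) {
            b.minus |= (std::uint64_t{1} << i);
        }
    }
    return b;
}

// Recover weight i from the two planes.
int weight_at(const TernaryBlock& b, int i) {
    assert(i >= 0 && i < 64);
    return static_cast<int>((b.plus >> i) & 1u) -
           static_cast<int>((b.minus >> i) & 1u);
}

// Portable population count; real kernels would use a hardware instruction.
int popcount64(std::uint64_t x) {
    return static_cast<int>(std::bitset<64>(x).count());
}

// Dot product of two packed ternary blocks using only AND and popcount:
// matching signs contribute +1, opposite signs contribute -1.
int dot64(const TernaryBlock& a, const TernaryBlock& b) {
    const int agree = popcount64(a.plus & b.plus) + popcount64(a.minus & b.minus);
    const int differ = popcount64(a.plus & b.minus) + popcount64(a.minus & b.plus);
    return agree - differ;
}
```

`dot64` only applies when both operands are ternary. With wider activations, the same two masks would instead select which activations to add and which to subtract.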

The inference core handles transformer operations including multi-head attention, KV caching, and rotary position embeddings. A complete pipeline exists from text training data through quantization to working model generation.
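As an illustration of one of those operations, here is a minimal sketch of rotary position embeddings in their common interleaved form. It is a generic version of the technique, not ciot's code: the function name `apply_rope`, the pairing of adjacent elements, and the base of 10000 are all assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Generic RoPE sketch (assumed form, not taken from ciot).
// Rotates each adjacent pair (x[2i], x[2i+1]) of one head vector in place.
// Pair i at token position `pos` is rotated by pos * base^(-2i/d).
void apply_rope(std::vector<float>& x, int pos, float base = 10000.0f) {
    assert(x.size() % 2 == 0);
    const std::size_t d = x.size();
    for (std::size_t i = 0; i < d / 2; ++i) {
        const float freq =
            std::pow(base, -2.0f * static_cast<float>(i) / static_cast<float>(d));
        const float angle = static_cast<float>(pos) * freq;
        const float c = std::cos(angle);
        const float s = std::sin(angle);
        const float a = x[2 * i];
        const float b = x[2 * i + 1];
        x[2 * i] = a * c - b * s;      // standard 2D rotation
        x[2 * i + 1] = a * s + b * c;
    }
}
```

Position 0 leaves the vector unchanged, and every rotation preserves the length of each pair, so the position signal shows up only in the relative angle between query and key vectors.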

Why is it gaining traction?

The hook is simplicity and speed. A 1024x1024 ternary matrix multiply hits around 4 billion operations per second on Snapdragon X with ARM NEON, roughly 5.7x faster than the scalar path. More importantly, every benchmark includes a checksum that matches the reference implementation, proving the SIMD kernels produce correct results. The project also has zero heavy dependencies in the inference core, making it genuinely portable.
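The checksum idea can be sketched as follows: run the same input through a reference kernel and an optimized kernel, hash both outputs, and compare. Everything here is an assumption for illustration. The "fast" path is a branch-free scalar stand-in rather than real SIMD, FNV-1a is an arbitrary choice of hash, and none of the names come from ciot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Vec = std::vector<std::int32_t>;
using Weights = std::vector<std::int8_t>;  // row-major, entries are -1, 0, or +1

// Reference path: the obvious scalar loop, easy to audit by eye.
Vec matvec_reference(const Weights& w, const Vec& x,
                     std::size_t rows, std::size_t cols) {
    assert(w.size() == rows * cols && x.size() == cols);
    Vec y(rows, 0);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            const std::int8_t t = w[r * cols + c];
            if (t > 0) y[r] += x[c];
            else if (t < 0) y[r] -= x[c];
        }
    }
    return y;
}

// Stand-in for an optimized kernel (not real SIMD): branch-free masking.
Vec matvec_fast(const Weights& w, const Vec& x,
                std::size_t rows, std::size_t cols) {
    assert(w.size() == rows * cols && x.size() == cols);
    Vec y(rows, 0);
    for (std::size_t r = 0; r < rows; ++r) {
        std::int32_t acc = 0;
        for (std::size_t c = 0; c < cols; ++c) {
            const std::int8_t t = w[r * cols + c];
            const std::int32_t keep_pos = -static_cast<std::int32_t>(t > 0);  // 0 or all ones
            const std::int32_t keep_neg = -static_cast<std::int32_t>(t < 0);  // 0 or all ones
            acc += (x[c] & keep_pos) - (x[c] & keep_neg);
        }
        y[r] = acc;
    }
    return y;
}

// 64-bit FNV-1a over the output bytes, least significant byte first.
std::uint64_t checksum(const Vec& y) {
    std::uint64_t h = 0xcbf29ce484222325ULL;
    for (std::int32_t v : y) {
        const std::uint32_t u = static_cast<std::uint32_t>(v);
        for (int b = 0; b < 4; ++b) {
            h ^= (u >> (8 * b)) & 0xFFu;
            h *= 0x100000001b3ULL;
        }
    }
    return h;
}

// A benchmark would report throughput only when this returns true.
bool kernels_agree(const Weights& w, const Vec& x,
                   std::size_t rows, std::size_t cols) {
    return checksum(matvec_reference(w, x, rows, cols)) ==
           checksum(matvec_fast(w, x, rows, cols));
}
```

Calling `kernels_agree` alongside each timed run ties the reported number to verified output. A kernel that is fast but wrong fails the check instead of posting a misleading result.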

Who should use this?

Researchers exploring extreme quantization for edge deployment will find the most value here. Embedded systems developers who need inference without framework overhead should also take a look, as should anyone curious about what ternary networks actually achieve in practice, since the full training-to-inference pipeline is self-contained and readable.

Verdict

This is a legitimate proof-of-concept with a clean, well-documented core. The checksum verification approach is worth noting for anyone building SIMD kernels. However, with only 10 stars and limited documentation, treat this as a technical exploration rather than production infrastructure. The 89% credibility score reflects an early-stage project with real engineering underneath. Worth watching if CPU-based ternary inference fits your roadmap.
