mechramc / Orion

Public

Local AI runtime for training & running small LLMs directly on Apple Neural Engine (ANE). No CoreML. No Metal. Offline, on-device fine-tuning & inference on M-series silicon.

69% credibility
Found Mar 06, 2026 at 10 stars
AI Analysis
Objective-C
AI Summary

Orion enables running and training small language models locally on Apple Silicon Macs using the device's neural engine for fast, private, offline AI experiences.

How It Works

1
🔍 Discover Orion

You find a cool tool that lets everyday Macs run smart AI chats and even learn new things, all without sending data online.

2
🛠️ Set it up

With a few easy steps on your Mac, you build Orion and get everything ready to go.

3
🧠 Add an AI brain

You grab a small ready-made language model and prepare it so Orion can use it right away.

4
Pick your fun
💬 Chat away

Type a question or story starter, and watch the AI reply super fast on your Mac.

📚 Train it

Give it simple stories or data, and let it learn to get even better over time.

5
Feel the speed

Your Mac's built-in brainpower makes the AI think lightning-fast, with total privacy since nothing leaves your device.

🎉 Your private AI buddy

Now you have a personal, offline AI for fun chats, writing help, or custom learning, running smoothly on your Mac.


AI-Generated Review

What is Orion?

Orion is a local AI runtime for training and running small LLMs such as GPT-2 124M and Stories110M directly on the Apple Neural Engine in M-series Macs. Built in Objective-C, it bypasses CoreML and Metal for offline, on-device inference at 170+ tok/s, and supports fine-tuning with delta compilation that skips full recompiles during training. Build with `make`, then run `./orion infer --prompt "..." --ane` or `./orion train --dataset tinystories.bin`: no cloud, and no data leaves your device.
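The delta-compilation idea mentioned above can be illustrated with a toy compile cache: only layers whose weights changed since the last training step get recompiled, and untouched layers are served from cache. This is a conceptual sketch of the technique, not Orion's actual implementation; all names here are made up for illustration.

```python
import hashlib

class DeltaCompiler:
    """Toy compile cache: recompiles a layer only when its weights change."""

    def __init__(self):
        self.cache = {}      # layer name -> (weights hash, compiled artifact)
        self.compiles = 0    # counts how many real compiles happened

    def _digest(self, weights):
        # Hash the weight values to detect changes between steps.
        return hashlib.sha256(repr(weights).encode()).hexdigest()

    def compile_layer(self, name, weights):
        key = self._digest(weights)
        cached = self.cache.get(name)
        if cached and cached[0] == key:
            return cached[1]            # unchanged: reuse compiled program
        self.compiles += 1              # changed (or new): compile for real
        artifact = f"program({name})"
        self.cache[name] = (key, artifact)
        return artifact

model = {"attn": [1.0, 2.0], "mlp": [3.0, 4.0]}
dc = DeltaCompiler()

for layer, w in model.items():          # initial full compile: 2 compiles
    dc.compile_layer(layer, w)

model["mlp"][0] += 0.1                  # a training step touches one layer
for layer, w in model.items():          # delta pass: only "mlp" recompiles
    dc.compile_layer(layer, w)

print(dc.compiles)  # 3
```

The payoff is that after the first full compile, a training step that perturbs a subset of weights only pays compile cost for that subset.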

Why is it gaining traction?

It claims a unique niche: targeting the ANE for both inference and training, outpacing Metal-based alternatives like MLX by tapping otherwise-idle NPU silicon for reported 3.8x faster training steps via weight hot-swaps and LoRA adapters that avoid recompiling. Benchmarks track tok/s, TFLOPS, and ANE utilization out of the box, and program caching keeps things snappy after the initial 4.5s compile. The polished CLI makes it an easy entry point for edge-AI experiments.
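The LoRA mechanism referenced above is what makes adapter hot-swaps cheap: the base weight matrix W stays frozen, and a low-rank product B @ A is added on top, so swapping adapters means swapping two small matrices rather than recompiling the model. A minimal sketch of that math, using plain Python lists (the names are illustrative, not Orion's API):

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def add(a, b):
    """Element-wise sum of two same-shape matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Frozen base weight: a 2x2 identity.
W = [[1.0, 0.0], [0.0, 1.0]]

# Rank-1 adapter: B (2x1) @ A (1x2) yields a 2x2 delta.
B = [[1.0], [0.0]]
A = [[0.0, 2.0]]

W_adapted = add(W, matmul(B, A))   # W + B @ A
x = [[3.0, 4.0]]
y = matmul(x, W_adapted)
print(y)  # [[3.0, 10.0]]
```

Because only B and A differ between adapters, a runtime can keep the compiled program for W fixed and hot-swap the small low-rank factors.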

Who should use this?

Apple Silicon ML engineers prototyping privacy-first coding assistants or copilot-style tools on Macs. Robotics hackers embedding offline LLMs in on-device apps, or researchers benchmarking the ANE for tiny-model training without a GPU dependency. Ideal if you're chasing on-device fine-tuning for medical or legal apps where data must stay local.

Verdict

Grab it if you're on M-series and want ANE-native LLM speed: v1.0 is complete with passing tests and docs, despite just 10 stars. The credibility score sits at 69%, signaling early-stage risks from private APIs, but the delta-compile hook makes it worth a spin for local runtime experiments.


