AndreSlavescu

Python-based eDSL for efficient Metal Shading Language code generation

100% credibility
Found Mar 17, 2026 at 10 stars.
AI Analysis
Python
AI Summary

meTile is a Python toolkit for writing and running high-performance computations on Apple Silicon GPUs using simple tiled recipes that compile to optimized code.

How It Works

1
🔍 Discover fast math on your Mac

You hear about a simple way to make heavy calculations zoom using your Mac's built-in graphics power, without needing expert skills.

2
📦 Get it ready in moments

With one easy action, you bring this tool onto your Mac, and everything is set up automatically.

3
Copy a ready example

Paste a short recipe for something like multiplying big grids of numbers, tweak a few sizes if you want, and launch it.

4
Watch it outperform others

Run a quick check and see your computation finish much quicker than common alternatives, with charts showing the speedup.

5
Tune for your needs
🚀 Go super fast

It finds the perfect setup automatically, squeezing out top speed.

Keep it simple

Stick with the good default and move on.

🎉 Your computations fly

Now your math-heavy projects run blazing fast on everyday Mac hardware, saving time and energy.

AI-Generated Review

What is meTile?

meTile is a Python-based embedded domain-specific language (eDSL) that generates efficient Metal Shading Language (MSL) code for Apple GPUs. Developers write tile-based kernels such as GEMM, softmax, or layernorm using simple decorators and ops such as tile_load, dot, and tile_store. These compile to optimized MSL and are dispatched through a ctypes Metal bridge that uses zero-copy unified memory. The goal is to remove the need to hand-write verbose Metal shaders for high-performance compute while delivering MLX-competitive speeds out of the box.
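To make the tile-based style concrete, here is a minimal plain-Python sketch of a tiled GEMM (general matrix multiply). It is not meTile code: the helpers ref_tile_load, ref_dot, and ref_tile_store are hypothetical stand-ins that only mirror the roles of the ops named above, and the block_m, block_n, and block_k parameters are assumed names for the tile sizes. meTile's real decorators and signatures are not shown in this review.

```python
# Illustration only: plain-Python stand-ins for the load / dot / store pattern.
# None of these names or signatures are taken from meTile.

def ref_tile_load(mat, row0, col0, rows, cols):
    """Copy a rows x cols tile whose top-left corner is (row0, col0)."""
    return [mat[r][col0:col0 + cols] for r in range(row0, row0 + rows)]


def ref_dot(a_tile, b_tile, acc):
    """Accumulate the matrix product a_tile @ b_tile into acc, in place."""
    for i, a_row in enumerate(a_tile):
        for k, a_val in enumerate(a_row):
            for j, b_val in enumerate(b_tile[k]):
                acc[i][j] += a_val * b_val


def ref_tile_store(out, row0, col0, tile):
    """Write a tile back into out at (row0, col0)."""
    for i, row in enumerate(tile):
        out[row0 + i][col0:col0 + len(row)] = row


def tiled_gemm(a, b, block_m=2, block_n=2, block_k=2):
    """Compute a @ b one output tile at a time."""
    m, k, n = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    assert m % block_m == 0 and n % block_n == 0 and k % block_k == 0
    c = [[0.0] * n for _ in range(m)]
    for m0 in range(0, m, block_m):          # one (m0, n0) pair per output tile
        for n0 in range(0, n, block_n):
            acc = [[0.0] * block_n for _ in range(block_m)]
            for k0 in range(0, k, block_k):  # march along the shared dimension
                a_tile = ref_tile_load(a, m0, k0, block_m, block_k)
                b_tile = ref_tile_load(b, k0, n0, block_k, block_n)
                ref_dot(a_tile, b_tile, acc)
            ref_tile_store(c, m0, n0, acc)
    return c
```

On a GPU, each (m0, n0) output tile would typically map to one threadgroup, with the accumulator held in fast on-chip storage until the single store at the end. The sketch keeps that loop structure but runs it sequentially.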

Why is it gaining traction?

Autotuning sweeps block sizes and layouts to find the fastest configuration, fused epilogues such as ReLU cut memory traffic, and benchmarks show it matching or beating MLX on FFT, GEMM, and norms. The CuTe-inspired layout algebra and Triton-like Python syntax make custom kernels feel natural, with no MSL boilerplate and no PyObjC dependency. The runtime is pure Python, and precompiled metallibs allow instant deployment.
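The idea behind an autotuning sweep can be sketched in a few lines of standard-library Python. This is a generic illustration, not meTile's autotuner: the autotune function, its search_space dictionary, and the optional measure callback are all hypothetical names chosen for this example. It tries every combination of candidate values, measures each one, and keeps the cheapest.

```python
import itertools
import time


def autotune(kernel, search_space, args, measure=None, repeats=3):
    """Try every config in the Cartesian product of search_space.

    kernel       -- callable invoked as kernel(*args, **config)
    search_space -- dict mapping parameter name to a list of candidate values
    measure      -- optional callable(config) -> cost; defaults to wall-clock time
    Returns (best_config, best_cost).
    """
    def wall_clock(config):
        # Best-of-N timing reduces noise from other processes.
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            kernel(*args, **config)
            best = min(best, time.perf_counter() - start)
        return best

    measure = measure or wall_clock
    names = sorted(search_space)
    best_config, best_cost = None, float("inf")
    for values in itertools.product(*(search_space[name] for name in names)):
        config = dict(zip(names, values))
        cost = measure(config)
        if cost < best_cost:
            best_config, best_cost = config, cost
    return best_config, best_cost
```

A call such as autotune(my_kernel, {"block_m": [16, 32, 64], "block_n": [32, 64]}, args) would time six configurations, where my_kernel is whatever callable accepts those keyword arguments. A real GPU autotuner would also need to synchronize with the device before stopping the clock, which this sketch does not model.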

Who should use this?

ML engineers on Apple Silicon tuning GEMM or elementwise ops beyond stock frameworks. Researchers prototyping tiled neural net kernels for M-series chips. Anyone benchmarking custom Metal compute against MLX for production inference.

Verdict

Promising for efficient Metal code generation, with strong benchmarks, tests, and a Makefile-driven dev workflow, but at 10 stars it is still early alpha. Prototype with it now if you're deep in Apple GPU perf; otherwise, track it as it matures.
