elixir-dux / dux

Public

Distributed DataFrames for Elixir powered by DuckDB

11
2
100% credibility
Found Mar 24, 2026 at 11 stars.
Elixir
AI Summary

Dux provides a simple, chainable way to load, transform, and analyze data files using DuckDB in Elixir applications.

How It Works

1. 📚 Discover Dux

You find Dux, a friendly tool that makes crunching numbers from files as easy as sorting a spreadsheet.

2. 🛠️ Add to your project

You add Dux to your Elixir setup with a single line, and it's ready to go.

3. 📁 Load your data

You pull in files like CSV, Excel, or folders of data with simple commands.

4. Shape and summarize

You filter what matters, add calculations, group by categories, and see totals build up step by step.

5. 🚀 Scale for big files

For huge datasets, you spread the work across additional BEAM nodes automatically.

🎉 Get your insights

Your results appear as neat tables or files, ready to explore or share your discoveries.

AI-Generated Review

What is dux?

Dux brings distributed dataframes to Elixir, powered by DuckDB's analytical engine. You get a familiar dplyr-style API—filter, mutate, group_by, summarise—with lazy pipelines that compile to optimized SQL and run across BEAM nodes for scalable processing. Load Parquet from S3, join Postgres tables, or run graph algorithms like PageRank, all without managing clusters or heavy RPC.
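The idea of a lazy pipeline that compiles to SQL can be sketched in a few lines of plain Elixir. The module below is a hypothetical illustration, not Dux's API or implementation: `LazySketch`, its verbs, and the `events` table are all assumed names. Each verb only records an operation, and a single SQL string is produced at the end.

```elixir
defmodule LazySketch do
  # Hypothetical illustration only: this is NOT Dux's API or implementation.
  defstruct source: nil, filters: [], groups: [], aggs: []

  # Start a pipeline from a table name; nothing is executed here.
  def from(table), do: %__MODULE__{source: table}

  # Each verb only records an operation on the struct.
  def filter(%__MODULE__{} = lf, condition),
    do: %{lf | filters: lf.filters ++ [condition]}

  def group_by(%__MODULE__{} = lf, cols),
    do: %{lf | groups: lf.groups ++ List.wrap(cols)}

  def summarise(%__MODULE__{} = lf, aggs),
    do: %{lf | aggs: lf.aggs ++ aggs}

  # Compile all recorded operations into one SQL string.
  def to_sql(%__MODULE__{} = lf) do
    agg_cols = Enum.map(lf.aggs, fn {name, expr} -> "#{expr} AS #{name}" end)

    select =
      case lf.groups ++ agg_cols do
        [] -> "*"
        cols -> Enum.join(cols, ", ")
      end

    where =
      case lf.filters do
        [] -> ""
        fs -> " WHERE " <> Enum.join(fs, " AND ")
      end

    group =
      case lf.groups do
        [] -> ""
        gs -> " GROUP BY " <> Enum.join(gs, ", ")
      end

    "SELECT #{select} FROM #{lf.source}#{where}#{group}"
  end
end

sql =
  LazySketch.from("events")
  |> LazySketch.filter("amount > 100")
  |> LazySketch.group_by("region")
  |> LazySketch.summarise(total: "sum(amount)")
  |> LazySketch.to_sql()

IO.puts(sql)
# SELECT region, sum(amount) AS total FROM events WHERE amount > 100 GROUP BY region
```

Because nothing runs until the final step, an engine such as DuckDB receives the whole query at once and can optimize it as a unit. A real library would also need to escape identifiers and values, which this sketch skips.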

Why is it gaining traction?

Unlike alternatives such as Spark or Polars, Dux uses the BEAM's native distribution: there are no cluster managers, and plain data is shipped to nodes where DuckDB executes it locally. Notable features include built-in IO for CSV, Excel, and NDJSON, full SQL access for extensions, and distributed writes that partition output directly to storage. The result blends tidyverse-style verbs with vectorized speed for Elixir data workflows.
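The "ship work to nodes, execute locally, merge the results" pattern can be illustrated with a minimal sketch. This is not how Dux is implemented: `PartialAggSketch` and its functions are hypothetical names, and local `Task` processes stand in for remote BEAM nodes so the example runs on a single machine.

```elixir
defmodule PartialAggSketch do
  # Hypothetical illustration only: this is NOT Dux's implementation.

  # Each worker aggregates its own partition into a map of group => sum.
  def local_sum(rows) do
    Enum.reduce(rows, %{}, fn {key, value}, acc ->
      Map.update(acc, key, value, &(&1 + value))
    end)
  end

  # The coordinator merges the small partial results.
  def merge(partials) do
    Enum.reduce(partials, %{}, fn partial, acc ->
      Map.merge(acc, partial, fn _key, a, b -> a + b end)
    end)
  end

  # Split the rows into roughly `workers` chunks, aggregate each chunk
  # concurrently, then merge. Tasks stand in for remote nodes here.
  def distributed_sum(rows, workers) do
    chunk_size = max(ceil(length(rows) / workers), 1)

    rows
    |> Enum.chunk_every(chunk_size)
    |> Enum.map(fn chunk -> Task.async(fn -> local_sum(chunk) end) end)
    |> Task.await_many()
    |> merge()
  end
end

rows = [{"eu", 10}, {"us", 5}, {"eu", 7}, {"us", 1}]
result = PartialAggSketch.distributed_sum(rows, 2)
IO.inspect(result)
```

On a connected cluster, the `Task.async/1` call could in principle be swapped for something like `:erpc.call(node, PartialAggSketch, :local_sum, [chunk])`, assuming the module is loaded on each node. Only the small per-group partial sums travel back to the coordinator, which is what keeps this style of aggregation cheap.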

Who should use this?

Elixir data engineers building ETL pipelines on BEAM clusters, especially those processing Parquet data lakes on S3 or attached sources such as Postgres or Iceberg. It also suits analytics developers who want distributed aggregation without Java overhead, and teams exploring graph analytics on event streams. Skip it if you're not on Elixir or need mature production stability.

Verdict

Promising for distributed dataframes in Elixir, but at 1.0% credibility (11 stars) and pre-production, stick to prototypes. The docs and guides are solid, but test under load first. Watch for the 1.0 release; it should slot nicely into Livebook or Nx pipelines.
