SouravRoy-ETL

Fast in-process SQL database for your data. A drop-in DuckDB alternative with Parquet/CSV/JSON/Avro/Arrow/SQLite built in. 1.1-8.6× faster.

14
2
100% credibility
Found Apr 23, 2026 at 14 stars.
AI Analysis
C++
AI Summary

SlothDB is a portable embedded analytical database engine that queries files in formats like Parquet, CSV, JSON, Avro, Arrow, SQLite, and Excel directly using SQL.

How It Works

1. 💡 Discover SlothDB

You hear about a simple tool that lets you query your data files directly with SQL, much like asking questions of your spreadsheets.

2. 📦 Get it ready

A single install step sets everything up on your computer, with no further configuration needed.

3. 🚀 Try your first question

You run a simple query against sample sales data, and in seconds you see totals grouped by region.

4. 📁 Point at your files

Point it at your CSV, Parquet, or Excel files, and it reads them in place without copying or importing anything.

5. 📊 See your answers

It sums sales or counts rows quickly and shows the results in a neat table.

🎉 Unlock your data magic

You can now get insights from any supported file without a separate load step, which saves hours of preparation.

AI-Generated Review

What is slothdb?

SlothDB is a C++20 embedded SQL engine for in-process analytics that queries Parquet, CSV, JSON, Avro, Arrow, SQLite, and Excel files directly, with no tables to create or data to import. Point SQL at local paths, globs, or HTTP/S3 URLs, for example `SELECT region, SUM(revenue) FROM 's3://bucket/sales.parquet' GROUP BY region`, and the project reports results 1.1-8.6x faster than DuckDB. It is available via the CLI (`slothdb file.slothdb`), Python (`db.sql("query").fetchdf()` returns a pandas DataFrame), Node (`npm i @slothdb/wasm`), or a browser playground.
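To make the query shape above concrete, here is a minimal Python sketch of the same grouped aggregation. It uses the standard-library `sqlite3` module purely as a stand-in engine, because this review does not show how a SlothDB connection is created. The table name `sales`, the helper `revenue_by_region`, and the sample rows are all made up for illustration.

```python
import sqlite3

# Stand-in engine: stdlib sqlite3, NOT SlothDB. Table and rows are hypothetical.
SAMPLE_ROWS = [
    ("north", 120.0),
    ("south", 80.0),
    ("north", 30.0),
    ("west", 50.0),
]


def revenue_by_region(rows):
    """Return {region: total revenue} using a grouped SQL aggregate."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, revenue REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(revenue) FROM sales "
            "GROUP BY region ORDER BY region"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    print(revenue_by_region(SAMPLE_ROWS))
```

With SlothDB, going by the description above, the `FROM` clause would name the file path or URL directly rather than a table created beforehand.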

Why is it gaining traction?

It ships 7 native formats and remote reads out of the box (no extensions) in a lean 8MB binary, with reported speedups such as 5x on CSV counts and 5.4x on Avro sums. That speed helps in time-limited environments such as GitHub Actions or AWS. A stable C ABI means extensions survive upgrades, unlike alternatives, and the CLI and Python APIs mirror familiar tools while adding fused scan-aggregates for quick wins on 1M-row benchmarks.
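The review does not define "fused scan-aggregate". A common reading is that the aggregate is updated while rows are being scanned, so no intermediate row set is built first. The Python sketch below contrasts the two approaches on a tiny in-memory CSV. It is a conceptual illustration only, not SlothDB's C++ implementation, and the function names and sample columns are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; column names are illustrative only.
CSV_TEXT = "region,revenue\nnorth,120\nsouth,80\nnorth,30\n"


def materialize_then_aggregate(text):
    """Two steps: load every row into a list, then aggregate the list."""
    rows = list(csv.DictReader(io.StringIO(text)))  # full intermediate copy
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += float(row["revenue"])
    return dict(totals)


def fused_scan_aggregate(text):
    """One pass: update running totals while scanning, keeping no row list."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(text)):
        totals[row["region"]] += float(row["revenue"])
    return dict(totals)


if __name__ == "__main__":
    print(fused_scan_aggregate(CSV_TEXT))
```

Both functions return the same totals. The fused version only ever holds one row plus the running totals in memory, which is the property that matters as files grow.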

Who should use this?

ETL developers who need fast queries over mixed formats on GitHub runners or AWS Lambda, notebook analysts scanning Parquet/CSV without bulk loads, and tool builders embedding local OLAP in apps. Skip it if you need distributed queries or multi-writer transactions: it is single-node OLAP.

Verdict

Early, with 14 stars and a 1.0% credibility score, but 359 passing tests and transparent benchmarks make it prototype-ready for fast processing in pipelines. Try it with `pip install slothdb` and `slothdb.demo()`, and benchmark your own workload before swapping out DuckDB.
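One way to benchmark a workload is a small timing harness like the Python sketch below. It is engine-agnostic by design: `time_query` and `speedup` are hypothetical helpers, and the harness only assumes each engine can be wrapped in a zero-argument function.

```python
import statistics
import time


def time_query(run_query, repeats=5):
    """Return the median wall-clock seconds over `repeats` calls of run_query()."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    if candidate_seconds <= 0:
        raise ValueError("candidate_seconds must be positive")
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    # Placeholder workload; swap in one wrapper function per engine.
    elapsed = time_query(lambda: sum(range(100_000)))
    print(f"median: {elapsed:.6f}s")
```

In practice each `run_query` would wrap a single engine call, for instance a function that calls `db.sql(query).fetchdf()` as quoted in this review, alongside the equivalent DuckDB call. How `db` is obtained is not shown here, so that part is left to the project's own documentation.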
