borchero

Native SDKs for DuckLake.

42
1
100% credibility
Found May 23, 2026 at 42 stars.
AI Analysis
Rust
AI Summary

DuckLake SDK is a software development kit that lets programmers work with data lakes in Python and Rust. A data lake is a way to store large amounts of data in organized, queryable files. This SDK provides tools to create tables, write data into them, read data back out, and even travel back in time to see historical versions of your data. It works with common data tools like Polars and DuckDB, and can store data either locally on your computer or in cloud storage like AWS S3. The project is designed to be a lightweight alternative to DuckDB's official data lake extension, giving developers more flexibility in how they build data applications.

How It Works

1. 💡 You need a better way to manage data

You have lots of data files scattered around and want a smarter way to organize, query, and version them like a professional data engineer.

2. 🗄️ You create your data lake

You set up a DuckLake instance with a simple SQLite database to track your tables and a folder to store your data files.

3. 📋 You define your tables

You create tables with specific columns and data types, just like setting up a well-organized spreadsheet with rules.

4. 📝 You write your data

Using Python tools you already know like Polars, you easily write your data into the lake where it gets stored efficiently as Parquet files.

5. You can work with your data in different ways

🔍 Query current data

Read the latest version of your tables and get immediate results.

βͺ
Time travel through history

Look back at what your data looked like at any point in time, like having a time machine for your tables.

✨ Your data is organized and accessible

Everything is neatly cataloged, versioned, and ready to use. You can share your data lake with teammates and build powerful data workflows.
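
The steps above can be sketched with nothing but Python's standard library. This is a simplified illustration of the idea (a SQLite catalog that records which data files belong to which snapshot), not the ducklake-sdk API and not the real DuckLake catalog schema. The table names `snapshot` and `data_file`, the helper functions, and the file names are all hypothetical, and no Parquet files are actually written here.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open a toy catalog database (hypothetical schema, for illustration only)."""
    con = sqlite3.connect(path)
    con.executescript(
        """
        CREATE TABLE IF NOT EXISTS snapshot (
            snapshot_id INTEGER PRIMARY KEY AUTOINCREMENT,
            created_at  TEXT DEFAULT CURRENT_TIMESTAMP
        );
        CREATE TABLE IF NOT EXISTS data_file (
            table_name     TEXT NOT NULL,
            path           TEXT NOT NULL,
            begin_snapshot INTEGER NOT NULL,  -- first snapshot that sees the file
            end_snapshot   INTEGER            -- snapshot that retired it, or NULL
        );
        """
    )
    return con


def commit_files(con, table_name, add=(), remove=()):
    """Create one new snapshot that adds and/or retires data files."""
    with con:  # a single SQL transaction per snapshot
        snap = con.execute("INSERT INTO snapshot DEFAULT VALUES").lastrowid
        con.executemany(
            "INSERT INTO data_file (table_name, path, begin_snapshot) "
            "VALUES (?, ?, ?)",
            [(table_name, p, snap) for p in add],
        )
        con.executemany(
            "UPDATE data_file SET end_snapshot = ? "
            "WHERE table_name = ? AND path = ? AND end_snapshot IS NULL",
            [(snap, table_name, p) for p in remove],
        )
    return snap


def files_at(con, table_name, snapshot_id=None):
    """List the files visible at a snapshot (default: the latest one)."""
    if snapshot_id is None:
        snapshot_id = con.execute(
            "SELECT MAX(snapshot_id) FROM snapshot"
        ).fetchone()[0]
    rows = con.execute(
        "SELECT path FROM data_file "
        "WHERE table_name = ? AND begin_snapshot <= ? "
        "AND (end_snapshot IS NULL OR end_snapshot > ?) "
        "ORDER BY path",
        (table_name, snapshot_id, snapshot_id),
    )
    return [r[0] for r in rows]


con = open_catalog()
first = commit_files(con, "events", add=["a.parquet"])
commit_files(con, "events", add=["b.parquet"], remove=["a.parquet"])
current = files_at(con, "events")            # latest version of the table
historical = files_at(con, "events", first)  # the table as of the first snapshot
```

Because old files are only marked as retired rather than deleted, reading an earlier snapshot is just a filter on the catalog. That is the general reason a metadata database plus immutable data files makes time travel cheap.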


AI-Generated Review

What is ducklake-sdk?

ducklake-sdk is a native SDK for accessing DuckLake data lakes directly from Rust and Python, with no dependency on DuckDB. DuckLake itself is a data lake format that stores table metadata in a SQL database and writes actual data as Parquet files. This SDK gives you a client library to read, write, and manage those tables programmatically. The Rust core powers everything, with Python bindings exposing the same functionality. You can work with schemas, tables, partitions, and transactions directly from code rather than through DuckDB's SQL interface.

Why is it gaining traction?

The main draw is avoiding DuckDB as a dependency. If you want to read DuckLake-formatted data without pulling in the full DuckDB runtime, this fills that gap. Time travel lets you read a table as of an earlier snapshot without additional tooling. The SDK handles schema evolution, partitioning, and constraint management natively, all of which are tedious to implement by hand. For teams already using Polars or DuckDB for data processing, the Python SDK's direct integration means you can land data in a DuckLake without leaving your existing Python workflow. The transaction system with conflict resolution also makes it viable for concurrent writes.
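
The review does not say how the SDK resolves conflicts, so the following is only a sketch of one common approach, optimistic concurrency: each transaction remembers the snapshot it started from, and a commit is rejected if a later snapshot already changed one of the same tables. The `Catalog`, `Transaction`, and `CommitConflict` names are hypothetical and are not taken from ducklake-sdk.

```python
from dataclasses import dataclass, field


class CommitConflict(Exception):
    """Raised when another commit touched the same table in the meantime."""


@dataclass
class Catalog:
    # commits[i] holds the table names changed by snapshot i + 1
    commits: list = field(default_factory=list)

    @property
    def latest(self):
        return len(self.commits)

    def begin(self):
        return Transaction(self, base=self.latest)


@dataclass
class Transaction:
    catalog: Catalog
    base: int                      # snapshot the transaction started from
    touched: set = field(default_factory=set)

    def write(self, table):
        self.touched.add(table)

    def commit(self):
        # Tables changed by snapshots committed after this transaction began.
        concurrent = set().union(*self.catalog.commits[self.base:])
        overlap = self.touched & concurrent
        if overlap:
            raise CommitConflict(f"concurrent change to {sorted(overlap)}")
        self.catalog.commits.append(frozenset(self.touched))
        return self.catalog.latest


catalog = Catalog()
writer_a, writer_b = catalog.begin(), catalog.begin()
writer_a.write("events")
writer_b.write("users")
writer_a.commit()
writer_b.commit()  # succeeds: the two writers touched different tables
```

A real implementation might retry or rebase a conflicting transaction instead of raising, and would usually check at a finer granularity than whole tables. The point of the sketch is only that writers on unrelated tables do not block each other.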

Who should use this?

Data engineers building pipelines that produce or consume DuckLake-formatted data and do not want to depend on DuckDB. Backend developers writing services that need to manage table schemas or metadata programmatically. Teams using Polars who want a lightweight way to write data into a DuckLake without the DuckDB dependency. If you're currently using DuckDB's DuckLake extension and want to simplify your deployment, this is worth evaluating. It's less suitable if you need Google Cloud Storage or Azure Blob support, or if you're running on Windows.

Verdict

Given the 1.0% credibility score, 42 stars, and alpha status, this is a niche project for specific use cases rather than a general recommendation. The codebase appears well-structured with good test coverage and documentation, but the small community means you may be on your own for troubleshooting. Worth trying if the no-DuckDB dependency solves a real problem for you, but treat it as an early-stage project with all the accompanying risks.
