tinyfish-io

What if you had all the data in the world?

44
5
85% credibility
Found May 22, 2026 at 44 stars.
AI Analysis
TypeScript
AI Summary

BigSet is a data platform that lets anyone create live, automatically-updating datasets by simply describing what they want in plain English. You tell it what data you need — like 'YC companies currently hiring engineers' or 'GPU prices across retailers' — and it figures out the structure, collects the data from across the web, and keeps it fresh on a schedule you choose. The platform includes curated public datasets you can explore immediately, and lets you create private datasets that only you can see. Once your dataset is built, you can browse it in a table view, search and filter, and export to spreadsheet formats. Built by TinyFish on open-source technology.

How It Works

1. 🔍 Discover live datasets

You land on the homepage and see curated datasets about AI companies hiring, GPU prices, and model pricing — all updating automatically.

2. 📝 Sign up for free

You create an account to access the dashboard where you can manage your own datasets alongside the curated ones.

3. Describe what you want

You type a plain-English description like 'YC companies currently hiring engineers with their funding stage and open roles', and the AI proposes a structure for your dataset.

4. Review and customize your schema

You see the columns the AI suggested, can rename them, change types, or add new ones — then confirm to create your dataset.
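
As a rough illustration of this review step, a schema can be modeled as a list of typed columns, with rename, retype, and add written as pure functions that return a new schema. Every name below (`ColumnType`, `DatasetSchema`, `renameColumn`, `changeColumnType`, `addColumn`) and the set of column types are assumptions made for the sketch; the page does not show BigSet's real schema format.

```typescript
// Hypothetical types: BigSet's actual schema format is not shown in the source.
type ColumnType = "string" | "number" | "boolean" | "date" | "url";

interface Column {
  name: string;
  type: ColumnType;
}

interface DatasetSchema {
  title: string;
  columns: Column[];
}

// Rename a column, rejecting unknown source names and duplicate target names.
function renameColumn(schema: DatasetSchema, from: string, to: string): DatasetSchema {
  if (!schema.columns.some((c) => c.name === from)) {
    throw new Error(`Unknown column: ${from}`);
  }
  if (from !== to && schema.columns.some((c) => c.name === to)) {
    throw new Error(`Column already exists: ${to}`);
  }
  return {
    ...schema,
    columns: schema.columns.map((c) => (c.name === from ? { ...c, name: to } : c)),
  };
}

// Change the type of an existing column.
function changeColumnType(schema: DatasetSchema, name: string, type: ColumnType): DatasetSchema {
  if (!schema.columns.some((c) => c.name === name)) {
    throw new Error(`Unknown column: ${name}`);
  }
  return {
    ...schema,
    columns: schema.columns.map((c) => (c.name === name ? { ...c, type } : c)),
  };
}

// Append a new column, rejecting duplicate names.
function addColumn(schema: DatasetSchema, column: Column): DatasetSchema {
  if (schema.columns.some((c) => c.name === column.name)) {
    throw new Error(`Column already exists: ${column.name}`);
  }
  return { ...schema, columns: [...schema.columns, column] };
}

// Illustrative suggestion for the example prompt; column names are made up.
const suggested: DatasetSchema = {
  title: "YC companies hiring engineers",
  columns: [
    { name: "company", type: "string" },
    { name: "funding_stage", type: "string" },
    { name: "open_roles", type: "string" },
  ],
};

// The user renames one column, fixes its type, and adds another before confirming.
const confirmed = addColumn(
  changeColumnType(
    renameColumn(suggested, "open_roles", "open_engineering_roles"),
    "open_engineering_roles",
    "number",
  ),
  { name: "careers_page", type: "url" },
);
```

Because each helper returns a new object, the original suggestion stays available if the user cancels the review.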

5. 🔄 Watch it come to life

Your dataset starts building automatically. Web agents collect the data and keep it fresh on your chosen schedule — every 30 minutes, daily, or weekly.
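
One simple way to think about the refresh schedule is as a fixed interval added to the time of the last completed run. The sketch below assumes that model; the cadence identifiers and function names are hypothetical, and the page does not say how BigSet actually schedules its agents.

```typescript
// Hypothetical cadence names; the source only lists "every 30 minutes, daily, or weekly".
type Cadence = "every_30_minutes" | "daily" | "weekly";

// Interval length for each cadence, in milliseconds.
const CADENCE_MS: Record<Cadence, number> = {
  every_30_minutes: 30 * 60 * 1000,
  daily: 24 * 60 * 60 * 1000,
  weekly: 7 * 24 * 60 * 60 * 1000,
};

// The next refresh is due one interval after the last completed run.
function nextRefreshAt(lastRun: Date, cadence: Cadence): Date {
  return new Date(lastRun.getTime() + CADENCE_MS[cadence]);
}

// A dataset is stale once the current time reaches its next refresh time.
function isStale(lastRun: Date, cadence: Cadence, now: Date): boolean {
  return now.getTime() >= nextRefreshAt(lastRun, cadence).getTime();
}

const lastRun = new Date("2026-05-22T12:00:00Z");
console.log(nextRefreshAt(lastRun, "every_30_minutes").toISOString());
// prints "2026-05-22T12:30:00.000Z"
```

A fixed-interval model drifts if runs take a long time; a production scheduler might anchor to wall-clock slots instead.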

6. Use your data

👀 Browse the table view

Scroll through your data in a clean table, filter by any column, and see updates as they happen.

📥 Export anytime

Download your data as a spreadsheet or CSV whenever you need it — fresh or historical.
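
For readers curious what a CSV export involves, here is a minimal sketch of turning table rows into CSV text with RFC 4180 style quoting. The `Row` shape, the function names, and the sample row are invented for illustration and do not come from the BigSet codebase.

```typescript
// Hypothetical row shape: one value per column name.
type Cell = string | number | boolean | null;
type Row = Record<string, Cell>;

// Quote a field only when it contains a comma, quote, or line break,
// doubling any embedded quotes.
function escapeCsvField(value: Cell): string {
  if (value === null) return "";
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Serialize a header line plus one line per row, in the given column order.
function toCsv(columns: string[], rows: Row[]): string {
  const header = columns.map(escapeCsvField).join(",");
  const body = rows.map((row) =>
    columns.map((col) => escapeCsvField(row[col] ?? null)).join(","),
  );
  return [header, ...body].join("\r\n");
}

// Made-up sample row, used only to show the quoting behavior.
const csv = toCsv(
  ["gpu", "retailer", "price_usd"],
  [{ gpu: "Example GPU", retailer: "Example Store, Inc.", price_usd: 999 }],
);
console.log(csv);
```

The retailer value is quoted in the output because it contains a comma; missing cells are written as empty fields.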

🎉 Your data stays fresh

You have a live, queryable dataset that updates automatically — no scrapers to maintain, no pipelines to babysit.


AI-Generated Review

What is bigset?

BigSet is a live dataset platform that turns natural language descriptions into continuously updated, queryable data. You type something like "YC companies currently hiring engineers with their funding stage and open roles," and the system generates a schema, spins up web agents to collect the data, and keeps it fresh on a schedule you choose. The frontend is a Next.js app backed by a Fastify agent runner, with Convex handling the database and Clerk managing authentication. Data gets exported as CSV or XLSX directly from the UI.

Why is it gaining traction?

The pitch is simple: stop building one-off scrapers. Instead of maintaining brittle pipelines that break when a website changes, you describe what you want and agents handle the rest. The schema inference workflow accepts a prompt, calls an LLM via OpenRouter, and returns a structured dataset definition ready to populate. Refresh cadences run from every 30 minutes to weekly, so pricing data, job listings, and funding rounds stay current without manual intervention. Nine curated public datasets ship out of the box covering GPU prices, AI model costs, and open-source repository stats.
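
To make the schema inference step more concrete, the sketch below builds a request for OpenRouter's OpenAI-compatible chat completions endpoint and validates the model's reply before treating it as a schema. The system prompt, the JSON shape it asks for, and the function names are assumptions for illustration; the review does not show BigSet's actual prompt, model choice, or response handling.

```typescript
// Assumption: OpenRouter's OpenAI-compatible chat completions endpoint.
const OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions";

interface InferredColumn {
  name: string;
  type: string;
}

// Build the HTTP request for a schema inference call.
// The prompt wording is illustrative, not BigSet's real prompt.
function buildSchemaRequest(description: string, model: string, apiKey: string) {
  return {
    url: OPENROUTER_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model,
        messages: [
          {
            role: "system",
            content: 'Reply only with JSON: {"columns":[{"name":string,"type":string}]}',
          },
          { role: "user", content: description },
        ],
      }),
    },
  };
}

// Validate the model's text reply instead of trusting it blindly.
function parseSchemaReply(content: string): InferredColumn[] {
  const parsed: unknown = JSON.parse(content);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Reply is not a JSON object");
  }
  const columns = (parsed as { columns?: unknown }).columns;
  if (!Array.isArray(columns) || columns.length === 0) {
    throw new Error("Reply has no columns array");
  }
  return columns.map((c) => {
    if (typeof c?.name !== "string" || typeof c?.type !== "string") {
      throw new Error("Malformed column in reply");
    }
    return { name: c.name, type: c.type };
  });
}
```

In use, the request would be sent with `fetch(url, init)` and the reply text read from `choices[0].message.content` in the JSON response, following the OpenAI-compatible response shape. Keeping request building and reply parsing as separate pure functions makes both easy to test without network access.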

Who should use this?

Product teams that need reference data without the ops overhead of running scrapers. Researchers tracking dynamic markets like cloud GPU pricing or venture rounds. Developers building internal tools that depend on external data sources. Early adopters comfortable with a self-hosted Docker setup and an AGPL license. If you want production-grade reliability and comprehensive test coverage, wait for more maturity.

Verdict

BigSet solves a real problem with an elegant interface, but at 44 stars and 85% credibility, it is early-stage software with a complex stack and no visible test suite. Spin it up locally to evaluate the schema inference and agent runner. Do not deploy to production without auditing the authorization layer yourself.
