cchenax / streamforge-ai
Open-source real-time AI data pipeline for CDC ingestion, feature generation, and storage-aware prefetching
This repository demonstrates a real-time data pipeline for AI workloads, including setups to capture database changes, prefetch frequently used files, and process events into aggregated features.
How It Works
You find this open-source project, which shows how change events from database records flow into AI tools.
You start a simple local setup with one easy command to capture changes from sample customer records.
You add, update, or remove sample customers and instantly see the live events appear in the flow.
You run a quick tool that picks the most frequently used data files and copies them to fast local storage before your AI work begins.
You start a background job that watches the live events and counts how many changes happen for each customer over time.
Everything runs locally on your computer, turning raw changes into aggregated, ready-to-use features for machine learning projects.
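The two processing steps above (picking files to prefetch and counting changes per customer over time) can be illustrated with a minimal Python sketch. This is not the repository's code: the event shape `(customer_id, op, timestamp)`, the function names, the fixed 60-second tumbling window, and the "most frequently accessed" ranking rule are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape (an assumption, not the project's schema):
# (customer_id, operation, timestamp in seconds).
ChangeEvent = Tuple[int, str, float]


def count_changes_per_window(
    events: Iterable[ChangeEvent],
    window_seconds: float = 60.0,
) -> Dict[Tuple[int, int], int]:
    """Count change events per (customer_id, window_index).

    Uses fixed, non-overlapping (tumbling) windows: an event with
    timestamp ts falls into window int(ts // window_seconds).
    """
    counts: Dict[Tuple[int, int], int] = defaultdict(int)
    for customer_id, _op, ts in events:
        window_index = int(ts // window_seconds)
        counts[(customer_id, window_index)] += 1
    return dict(counts)


def pick_files_to_prefetch(access_log: Iterable[str], top_k: int = 2) -> List[str]:
    """Return the top_k most frequently accessed file paths.

    Assumes "most-needed" means "most frequently accessed"; a real
    prefetcher might also weigh file size or recency.
    """
    return [path for path, _count in Counter(access_log).most_common(top_k)]


# Sample data, invented for this example.
events = [
    (1, "insert", 5.0),
    (1, "update", 30.0),
    (2, "insert", 45.0),
    (1, "delete", 70.0),
]
features = count_changes_per_window(events, window_seconds=60.0)

access_log = [
    "a.parquet", "b.parquet", "a.parquet",
    "c.parquet", "a.parquet", "b.parquet",
]
to_prefetch = pick_files_to_prefetch(access_log, top_k=2)
```

With this sample data, customer 1 has two changes in the first window and one in the second, and the two most accessed files are `a.parquet` and `b.parquet`. A streaming job would apply the same counting logic incrementally as events arrive rather than over a finished list.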