Production-style real-time e-commerce lakehouse with Kafka, Airflow, Databricks, Medallion architecture, data quality, quarantine, Terraform, and Dash analytics.
StreamCommerce Lakehouse 360 is a portfolio project that demonstrates how a complete e-commerce analytics platform works. It simulates realistic shopping events (orders, payments, inventory, shipments, customer behavior, and product changes), processes them through the three layers of a Medallion pipeline (Bronze, Silver, and Gold), catches and quarantines bad data, and presents business insights through an interactive dashboard. The project includes everything needed to run locally for demonstration purposes, as well as patterns for deploying to cloud services.
How It Works
Someone shares this project with you as an example of a complete data platform that handles everything from shopping events to business reports.
With a single command, you start up all the services: event generators, data processors, and the analytics dashboard all come to life automatically.
Realistic e-commerce events appear (orders, payments, inventory changes, customer clicks), and each one moves through the data pipeline on its own.
Valid records continue through the pipeline, getting organized and prepared for analysis.
Records with missing IDs, invalid prices, or invalid statuses are quarantined: they are saved separately so they can be investigated later.
A colorful dashboard shows revenue trends, top products, customer behavior, inventory health, and payment reliability - all updated as new data arrives.
You've seen end-to-end how a modern data platform works - from raw events to business insights - ready to demonstrate or learn from.
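The split between valid and quarantined records described in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the field names (`order_id`, `price`, `status`), the set of allowed statuses, and the function names `validate_order` and `route` are all assumptions made for the example.

```python
# Hypothetical set of allowed statuses; the project's real values may differ.
VALID_STATUSES = {"created", "paid", "shipped", "cancelled"}


def validate_order(event: dict) -> list:
    """Return a list of rule violations; an empty list means the record is valid."""
    problems = []
    # Rule 1: the record must carry a non-empty ID.
    if not event.get("order_id"):
        problems.append("missing order_id")
    # Rule 2: the price must be a positive number (booleans are rejected).
    price = event.get("price")
    if (
        not isinstance(price, (int, float))
        or isinstance(price, bool)
        or price <= 0
    ):
        problems.append("invalid price")
    # Rule 3: the status must be one of the allowed values.
    if event.get("status") not in VALID_STATUSES:
        problems.append("invalid status")
    return problems


def route(events):
    """Split raw events into valid records and quarantined records with reasons."""
    valid, quarantined = [], []
    for event in events:
        problems = validate_order(event)
        if problems:
            # Keep the original payload and attach the reasons for later review.
            quarantined.append({**event, "quarantine_reasons": problems})
        else:
            valid.append(event)
    return valid, quarantined


if __name__ == "__main__":
    sample = [
        {"order_id": "o1", "price": 19.99, "status": "paid"},
        {"order_id": None, "price": 5.0, "status": "paid"},
        {"order_id": "o3", "price": -1, "status": "unknown"},
    ]
    good, bad = route(sample)
    print(len(good), "valid,", len(bad), "quarantined")
```

Storing the reasons alongside the original payload means a quarantined record can be investigated, and possibly replayed, without going back to the raw event stream.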