data-context-hq

Open-source runtime attribution and context observability for data access by AI agents and applications

89% credibility
Found May 19, 2026 at 15 stars.
AI Analysis
Python
AI Summary

DataContext is a Python library that helps developers understand which user requests, background jobs, or code paths triggered specific database queries. It wraps your existing database code and automatically captures an event for each query, recording where the query came from, how long it took, and what context was active. Each event includes the service name, database type, query fingerprint, and the full call stack showing which functions led to the query. You can also attach custom labels such as user IDs, request IDs, or tenant names so that every query carries that context. Events are sent to the destination you choose (console, file, custom handler, or OpenTelemetry) without changing your application's behavior or blocking your code.
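To make the event contents concrete, here is a minimal sketch of what one such event might look like when serialized as a JSON line. The class name `QueryEvent` and every field name are illustrative assumptions based on the description above; they are not the library's actual schema.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class QueryEvent:
    """Hypothetical event shape; field names are assumptions, not the library's schema."""
    service: str
    db_system: str
    fingerprint: str
    duration_ms: float
    stack: list[str] = field(default_factory=list)
    labels: dict[str, str] = field(default_factory=dict)

    def to_json_line(self) -> str:
        # sort_keys keeps the key order stable, which helps with diffs and tests.
        return json.dumps(asdict(self), sort_keys=True)


event = QueryEvent(
    service="checkout-api",
    db_system="postgresql",
    fingerprint="select * from orders where id = ?",
    duration_ms=12.5,
    stack=["handlers.py:42 in checkout", "repo.py:17 in load_order"],
    labels={"user_id": "123", "request_id": "req_abc"},
)
print(event.to_json_line())
```

Keeping the event flat and small like this is one way to make it easy to ship to any of the destinations listed above.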

How It Works

1. 💡 You realize you need better visibility

Your database logs show queries, but you can't tell which user request or code path triggered them.

2. 📦 You add the library to your project

You install DataContext with a simple command, and optionally add support for SQLAlchemy or OpenTelemetry.

3. ⚙️ You set up your service details

You tell DataContext your service name and environment, like naming your app 'checkout-api' in 'production'.

4. 🔌 You connect your database functions

You point DataContext at your database code, and it automatically watches every query that runs.

5. 🏷️ You optionally add context

You wrap your business logic with labels like 'checkout', 'user:123', or 'req_abc' so queries carry that information.

6. You choose where events go

🖥️ Console or file

Simple output that writes events as JSON lines to stdout or a log file.

🔗 Custom pipeline

Send events to your own handler, data warehouse, or analytics system.

🔍 OpenTelemetry

Connect to your existing tracing system to correlate queries with other spans.

🎯 You can finally answer the question

For any query in your logs, you now know exactly which request, user, and code path caused it.

AI-Generated Review

What is datacontext?

datacontext is a Python library that answers a question most backend developers struggle with: which request, job, or agent triggered this query? It wraps database access points in your application and emits structured events containing the query, where it came from in code, and who was responsible. The library automatically captures callsite information (file, line, function, full stack trace), normalizes and fingerprints queries, and correlates them with runtime context like actor IDs, request IDs, and OpenTelemetry trace information. Events flow to JSONL files, stdout, callbacks, or OpenTelemetry sinks depending on your pipeline.
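As an illustration of the normalization, fingerprinting, and callsite capture mentioned above, the following sketch replaces literals with placeholders, hashes the result, and reads the caller's frame from the stack. The function names `normalize`, `fingerprint`, and `callsite` are hypothetical, and the regex-based normalization is a deliberate simplification (for example, it lowercases quoted identifiers); the library's real rules may differ.

```python
import hashlib
import re
import traceback

_STRING = re.compile(r"'(?:[^']|'')*'")      # single-quoted SQL string literals
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")   # standalone integer and decimal literals
_SPACE = re.compile(r"\s+")


def normalize(sql: str) -> str:
    """Replace literals with '?', collapse whitespace, and lowercase."""
    sql = _STRING.sub("?", sql)
    sql = _NUMBER.sub("?", sql)
    return _SPACE.sub(" ", sql).strip().lower()


def fingerprint(sql: str) -> str:
    """Short stable identifier shared by queries that differ only in literals."""
    return hashlib.sha256(normalize(sql).encode("utf-8")).hexdigest()[:16]


def callsite(skip: int = 1) -> dict:
    """Return file, line, and function of the frame `skip` levels above this one."""
    frame = traceback.extract_stack()[-(skip + 1)]
    return {"file": frame.filename, "line": frame.lineno, "function": frame.name}


a = "SELECT * FROM users WHERE id = 42 AND name = 'Ann'"
b = "select *  from users\nwhere id = 7 and name = 'O''Brien'"
same = fingerprint(a) == fingerprint(b)  # both normalize to the same text


def load_order():
    return callsite()


site = load_order()
```

Grouping by fingerprint rather than raw SQL is what lets two calls with different parameter values count as the same query shape.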

Why is it gaining traction?

The hook here is simple: queries lose context. By the time a query hits your data warehouse or monitoring system, you cannot tell which user session, background job, or AI agent caused it. datacontext reconnects that severed chain without requiring you to instrument every individual query. Configuration is a one-time setup that wraps existing data-access functions. The library prioritizes production safety -- instrumentation failures fall back gracefully, sink errors get logged and dropped, and wrapped functions preserve their original return values and exceptions. This is observability that does not break your app.
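The safety properties described above (sink errors logged and dropped, return values and exceptions preserved) can be sketched as a decorator. This is an assumption about how such a guarantee could be implemented, not the library's code; `safe_instrument`, `build_event`, and `broken_sink` are hypothetical names.

```python
import functools
import logging

logger = logging.getLogger("attribution_sketch")


def safe_instrument(build_event, sink):
    """Hypothetical decorator: observe a function without altering its behavior."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result, error = None, None
            try:
                result = func(*args, **kwargs)
            except Exception as exc:
                error = exc  # remember the failure so it can be re-raised unchanged
            try:
                sink(build_event(args, kwargs, error))
            except Exception:
                # Instrumentation or sink failure: log it and drop the event.
                logger.exception("dropping attribution event")
            if error is not None:
                raise error
            return result
        return wrapper
    return decorator


def broken_sink(event):
    raise RuntimeError("sink is down")


@safe_instrument(
    lambda args, kwargs, error: {"args": args, "failed": error is not None},
    broken_sink,
)
def fetch_total(order_id):
    if order_id < 0:
        raise ValueError("bad order id")
    return 9.99
```

Even with a sink that always fails, `fetch_total(1)` still returns its value and `fetch_total(-1)` still raises `ValueError`; the only visible side effect is a logged error.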

Who should use this?

Platform teams building internal developer tooling who need visibility into how production services interact with databases. Backend developers debugging unexpected load patterns or trying to attribute slow queries to specific code paths. Teams running AI agents that query databases and need to understand which agent caused which query. Multi-tenant applications where you need per-tenant query attribution without modifying every database call site.

Verdict

datacontext shows strong design instincts -- a small stable schema, production-first error handling, and a clear problem statement. The credibility score of 89% reflects genuine early-stage status: 15 stars, version 0.1.0, and limited community signals. If you need query attribution for internal tools or low-stakes projects, evaluate this now. For customer-facing production systems, wait for more adoption and a stable 1.0 release before trusting it on critical paths.
