Per0x1de-1337

Light-weight self healing scraper

19
3
100% credibility
Found May 12, 2026 at 19 stars.
AI Analysis
Python
AI Summary

Selfmend is a web scraping tool that uses AI to automatically detect and repair broken element locators when websites change, making data extraction reliable over time.

How It Works

1
🔍 Discover Selfmend

You find Selfmend when you are tired of scrapers breaking every time a website changes a button or label.

2
📦 Set it up

You install it and connect an LLM, which is what lets it repair itself.

3
Tell it what to grab

🖱️ Click to pick

A browser opens, you click fields on the page, and it works out stable selectors for those fields.

💬 Describe simply

Type what you want in everyday language, and the AI drafts a spec after a quick look at the page.

4
▶️ Start scraping

Point it at your website, and it begins collecting data across pages automatically.

5
🛠️ Auto-fixes breaks

If a selector stops working, it finds a replacement and tests it before carrying on, so the run keeps going.

📊 Get perfect data

You receive a clean file full of the info you wanted, with reports on what it fixed along the way.
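
The repair step above can be sketched in a few lines. This is a minimal illustration of the general "try, propose, validate" pattern, not selfmend's actual code: the `Page` dictionary stands in for a rendered page, and `extract`, `heal`, `propose`, and `is_valid` are hypothetical names introduced here. In the real tool the proposals would come from an LLM rather than a fixed list.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# Hypothetical stand-in for a rendered page: maps a selector to its text.
Page = Dict[str, str]


def extract(page: Page, selector: str) -> Optional[str]:
    """Return the text for a selector, or None when it no longer matches."""
    return page.get(selector)


def heal(
    page: Page,
    known: str,
    propose: Callable[[str], Iterable[str]],
    is_valid: Callable[[str], bool],
) -> Tuple[str, str, bool]:
    """Try the known selector first, then test proposed replacements in order.

    Returns (selector_used, value, was_healed).
    """
    value = extract(page, known)
    if value is not None and is_valid(value):
        return known, value, False
    for candidate in propose(known):
        value = extract(page, candidate)
        # A candidate only wins if it matches AND its value passes validation.
        if value is not None and is_valid(value):
            return candidate, value, True
    raise LookupError(f"no working replacement found for {known!r}")


# The site renamed its title class, so the old selector no longer matches.
page = {".product-title-v2": "Blue Mug"}
selector, value, healed = heal(
    page,
    ".product-title",
    propose=lambda old: [".title", ".product-title-v2"],  # assumed candidates
    is_valid=lambda text: bool(text.strip()),
)
```

Testing each candidate before accepting it is what keeps a bad guess from silently writing wrong data into the output.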

AI-Generated Review

What is selfmend?

selfmend is a lightweight, self-healing Python scraper built on Playwright that mends broken selectors when a site changes its CSS classes or schema. You give it a YAML spec describing fields such as "title, price in pounds", and it scrapes lists or pages using stealth browsing and human-like mouse movement. When a selector fails, an LLM proposes a fix from the page's accessibility tree; the fix is validated with regex, type, and range checks, and selectors that pass are cached in SQLite. The `selfmend init-interactive` command lets you build a spec by pointing and clicking, without writing YAML, while `selfmend init` has an LLM draft one from a plain-English description.
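
To make the validation and caching idea concrete, here is a minimal sketch using only the Python standard library. It is an assumption about how such checks could look, not selfmend's implementation: the `validate` and `cache_winner` helpers, the `winners` table, and the `span.price` selector are all hypothetical.

```python
import re
import sqlite3
from typing import Callable, Optional


def validate(
    raw: str, pattern: str, cast: Callable[[str], float], lo: float, hi: float
) -> Optional[float]:
    """Apply regex, type, and range checks; return the parsed value or None."""
    match = re.fullmatch(pattern, raw.strip())
    if match is None:
        return None  # regex check failed
    try:
        value = cast(match.group(1))
    except ValueError:
        return None  # type check failed
    return value if lo <= value <= hi else None  # range check


def cache_winner(conn: sqlite3.Connection, field: str, selector: str) -> None:
    """Remember the selector that last produced a valid value for a field."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS winners "
        "(field TEXT PRIMARY KEY, selector TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO winners (field, selector) VALUES (?, ?)",
        (field, selector),
    )
    conn.commit()


# Hypothetical rule for a "price in pounds" field.
PRICE_PATTERN = r"£(\d+(?:\.\d{2})?)"

conn = sqlite3.connect(":memory:")
price = validate("£12.99", PRICE_PATTERN, float, 0.0, 10_000.0)
if price is not None:
    cache_winner(conn, "price", "span.price")
```

Only a selector whose value survives all three checks is written to the cache, so later runs can start from a selector that is known to have worked.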

Why is it gaining traction?

Unlike brittle scrapers that fail when a class is renamed, selfmend heals selectors on the fly, tracks selector success rates with `selfmend stats`, detects schema drift between runs, and replays failed heals offline. The interactive browser picker auto-detects repeating rows for lists, while stealth patches and Bezier mouse curves are meant to dodge bot detection, so users get reliable data without monthly fire drills. Using gpt-4o-mini keeps OpenAI costs low enough for daily pipelines.
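
Schema drift detection between runs can be pictured as a comparison of field names and value types. The sketch below is one simple way to do that, offered as an assumption; the review does not say how selfmend computes drift, and `field_types` and `schema_drift` are hypothetical names.

```python
from typing import Any, Dict, List, Set

Record = Dict[str, Any]


def field_types(records: List[Record]) -> Dict[str, Set[str]]:
    """Map each field name to the set of Python type names seen for it."""
    seen: Dict[str, Set[str]] = {}
    for record in records:
        for name, value in record.items():
            seen.setdefault(name, set()).add(type(value).__name__)
    return seen


def schema_drift(previous: List[Record], current: List[Record]) -> Dict[str, List[str]]:
    """Report fields that vanished, appeared, or changed type between two runs."""
    before, after = field_types(previous), field_types(current)
    shared = before.keys() & after.keys()
    return {
        "missing": sorted(before.keys() - after.keys()),
        "added": sorted(after.keys() - before.keys()),
        "retyped": sorted(name for name in shared if before[name] != after[name]),
    }


# Illustrative data only: the price became a string and a new field appeared.
yesterday = [{"title": "Blue Mug", "price": 12.99}]
today = [{"title": "Blue Mug", "price": "12.99", "rating": 4}]
drift = schema_drift(yesterday, today)
```

A report like this surfaces the silent failures that a scraper would otherwise pass straight through to its JSON output.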

Who should use this?

Data engineers maintaining product-catalog or job-listing scrapers that break on site updates. Indie devs scraping e-commerce sites for price trackers or research tools. Teams with 10+ scrapers who are tired of YAML tweaks and silent failures in JSON outputs.

Verdict

Try it for lightweight self-healing on Playwright jobs: the CLI shines and the README covers setup fast. But with 19 stars and 1.0% credibility, it's early alpha, with no tests exposed, so expect rough edges. Solid for prototypes; watch for stability.
