WillyEverGreen / acon

Public

The intelligence layer for any web scraper. Pair with Scrapling, Playwright, or httpx to crawl smarter.

100% credibility
Found May 06, 2026 at 15 stars.
AI Analysis
Python
AI Summary

Acon is a smart website explorer that automatically maps site structures and page patterns to make data collection efficient and targeted.

How It Works

1
🌐 Discover a smarter way to explore websites

You want to understand a website's hidden structure—like product pages, categories, or articles—without endlessly clicking links.

2
📦 Get Acon ready in moments

Download this friendly tool with a simple one-click install, no complicated setup needed.

3
🔗 Point it to your website

Just share the main web address, like your online store or news site, and set a gentle limit on how much to explore.

4
🧠 Let Acon map the site intelligently

It smartly wanders the site, spotting patterns in pages and layouts, skipping repeats to learn the full blueprint fast.
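One way to picture "spotting patterns and skipping repeats" is to collapse each URL into a coarse template and group pages by it. The sketch below is an illustration only, not acon's actual method: the names `url_template` and `group_by_template` are hypothetical, and it assumes that numeric path parts are the only thing that varies between pages of the same type.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit


def url_template(url: str) -> str:
    """Collapse a URL into a coarse pattern so similar pages group together."""
    parts = urlsplit(url)
    segments = []
    for seg in parts.path.split("/"):
        if not seg:
            continue  # skip empty pieces from leading or trailing slashes
        if seg.isdigit():
            segments.append("{num}")  # a pure numeric id, e.g. /item/42
        else:
            # Replace embedded numbers, e.g. "page-2.html" -> "page-{num}.html"
            segments.append(re.sub(r"\d+", "{num}", seg))
    return parts.netloc + "/" + "/".join(segments)


def group_by_template(urls):
    """Map each template to the concrete URLs that match it."""
    groups = defaultdict(list)
    for url in urls:
        groups[url_template(url)].append(url)
    return dict(groups)


urls = [
    "https://example.com/catalogue/page-1.html",
    "https://example.com/catalogue/page-2.html",
    "https://example.com/about",
]
groups = group_by_template(urls)
```

Under this assumption a crawler would only need to fetch one or two pages per template to learn what that page type looks like, rather than every URL that matches it.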

5
📊 Review your site blueprint

Get a clear summary of page types, navigation flows, and the site's overall shape, ready to use right away.

Harvest data smarter and faster

Now you know exactly where to focus, saving time and effort on monitoring prices, archiving content, or analyzing sites.

AI-Generated Review

What is acon?

Acon is a Python library that serves as the intelligence layer for web scrapers: it takes a seed URL and automatically maps a site's topology, such as paginated e-commerce catalogs or deep uniform structures. It classifies page types, prioritizes discovery via sitemaps and links, and outputs summaries with clean Markdown content via Trafilatura, ready for tools like Scrapling, Playwright, or httpx. By learning the site's structure early, it avoids blind crawling.
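To make "prioritizes discovery via sitemaps and links" concrete, here is a minimal sketch using only the standard library. It is not acon's API: `sitemap_urls` and `build_frontier` are hypothetical helper names, the sample sitemap is made up, and the ordering rule (sitemap entries first, then links found on pages) is an assumption about what such prioritization could look like.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# Made-up sitemap used only for this illustration.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/products</loc></url>
</urlset>"""


def sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


def build_frontier(sitemap_xml: str, discovered_links: list[str]) -> list[str]:
    """Order candidates: sitemap entries first, then page links, without duplicates."""
    seen, frontier = set(), []
    for url in [*sitemap_urls(sitemap_xml), *discovered_links]:
        if url not in seen:
            seen.add(url)
            frontier.append(url)
    return frontier


frontier = build_frontier(
    SAMPLE_SITEMAP,
    ["https://example.com/products", "https://example.com/blog"],
)
```

The resulting list could then be handed to whichever fetcher is in use (Scrapling, Playwright, or httpx); the fetching itself is deliberately left out of the sketch.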

Why is it gaining traction?

Its benchmarks beat blind BFS by a wide margin: 97% fewer requests on books.toscrape.com (6 vs 200 pages) and 92% fewer on Wikipedia. Early stopping on low information gain, static-first hybrid escalation, and asset blocking deliver real bandwidth savings. Stealth Camoufox integration bypasses bot walls, which makes it a smarter option than raw httpx loops.
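"Early stopping on low information gain" can be sketched as a loop that gives up once recent pages stop revealing anything new. The version below is a toy under stated assumptions, not acon's algorithm: "information gain" is reduced to "did this URL introduce an unseen pattern", and `pattern_of`, `explore`, `window`, and `max_pages` are hypothetical names.

```python
import re


def pattern_of(url: str) -> str:
    """Hypothetical page-pattern key: every run of digits becomes a placeholder."""
    return re.sub(r"\d+", "{n}", url)


def explore(urls, window: int = 5, max_pages: int = 200):
    """Visit URLs in order; stop after `window` consecutive pages add no new pattern."""
    seen_patterns = set()
    visited = 0
    stale = 0  # consecutive pages that taught us nothing new
    for url in urls:
        if visited >= max_pages or stale >= window:
            break
        visited += 1
        pattern = pattern_of(url)
        if pattern in seen_patterns:
            stale += 1
        else:
            seen_patterns.add(pattern)
            stale = 0
    return visited, seen_patterns


# Toy input: one home page plus 199 near-identical listing pages.
candidate_urls = ["https://example.com/"] + [
    f"https://example.com/catalogue/page-{i}.html" for i in range(1, 200)
]
visited, patterns = explore(candidate_urls)
```

With this toy input the loop stops after 7 of the 200 candidate URLs. That figure comes from the made-up data and the chosen `window`, and is not a reproduction of the benchmark quoted above.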

Who should use this?

E-commerce price trackers needing pagination maps without selectors. Content archivists feeding publications into LLMs. SEO auditors wanting instant topology reports (SPA vs static). Backend devs pairing it with Scrapling for production crawls, tired of URL exhaustion on sites like PyPI or Hacker News.
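The "SPA vs static" distinction mentioned for topology reports could be approximated by checking how much visible text a plain HTTP response already contains. This is a rough heuristic sketched for illustration, not how acon decides: `needs_browser` and the 200-character threshold are assumptions, and real pages will need more careful rules.

```python
from html.parser import HTMLParser


class _TextCounter(HTMLParser):
    """Count visible text characters and <script> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.text_chars = 0
        self.script_tags = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_tags += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.text_chars += len(data.strip())


def needs_browser(html: str, min_text_chars: int = 200) -> bool:
    """Guess whether a page relies on JavaScript rendering (hypothetical rule)."""
    parser = _TextCounter()
    parser.feed(html)
    parser.close()
    return parser.script_tags > 0 and parser.text_chars < min_text_chars


spa_html = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'
static_html = "<html><body><article>" + "word " * 100 + "</article></body></html>"
```

A static-first crawler could fetch with a plain HTTP client, run a check like this on the response, and only escalate to a headless browser such as Playwright when the check returns `True`.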

Verdict

Grab it for prototypes: the benchmarks and README docs impress, but 15 stars signal an early beta, so test resumption and scale yourself. Solid if you're on Python 3.10+ with Playwright.
