web_fetch vs Smart Scraper

Same page: https://www.wikipedia.org — what each tool actually returns

Mode: Extract Everything
🌐 web_fetch
This domain is for use in documentation examples without needing permission. Avoid use in operations. [Learn more](https://iana.org/domains/example)
🕷️ Smart Scraper
[smart-scraper] Extracted from https://www.wikipedia.org:
  Title: Wikipedia
  Headings: 1
  Paragraphs: 1
  Links: 330
  Tables: 0
  Lists: 6
  Prices: 4
  Images: 0
  Metadata keys: 6
  Content length: 118,282 chars
Mode: Tables Only
🌐 web_fetch
(You get raw HTML. Tables are buried in <table> tags mixed with everything else. You have to parse them yourself.)
🕷️ Smart Scraper
[smart-scraper] Tables found (0):
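To make "parse them yourself" concrete, here is a minimal sketch of what table extraction from raw HTML could look like using only Python's standard library. The names `TableExtractor` and `extract_tables` are hypothetical, and this is not how Smart Scraper or web_fetch is implemented; it ignores nested tables and attributes such as colspan.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> as a list of rows; each row is a list of cell strings.

    Hypothetical helper for illustration only. Nested tables are not handled.
    """

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html):
    """Return all tables found in an HTML string (empty list if none)."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


sample = (
    "<p>intro</p><table><tr><th>Name</th><th>Qty</th></tr>"
    "<tr><td>Apples</td><td> 3 </td></tr></table>"
)
print(extract_tables(sample))  # [[['Name', 'Qty'], ['Apples', '3']]]
```

A page with no `<table>` elements yields an empty list, which corresponds to the "Tables found (0)" case above.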
Mode: Lists Only
🌐 web_fetch
(You get raw HTML. Lists are <ul> and <li> tags mixed into the noise. You have to extract them yourself.)
🕷️ Smart Scraper
[smart-scraper] Lists found (6):
  1 items:
  • العربية Deutsch English Español فارسی Français Italiano مصرى Nederlands 日本語 Pols
  1 items:
  • Afrikaans Shqip Asturianu Azərbaycanca Български 閩南語 / Bân-lâm-gú বাংলা Беларуск
  1 items:
  • Bahsa Acèh Alemannisch አማርኛ Aragonés Արեւմտահայերէն Bahasa Hulontalo Basa Bali B ...
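For comparison, a rough sketch of pulling `<ul>`/`<ol>` items out of raw HTML by hand, again with the standard library only. `ListExtractor` and `extract_lists` are hypothetical names used for illustration, not part of either tool; nested lists are emitted in the order they close, so an inner list appears before its parent.

```python
from html.parser import HTMLParser


class ListExtractor(HTMLParser):
    """Collect the text of <li> items, grouped by their enclosing <ul>/<ol>."""

    def __init__(self):
        super().__init__()
        self.lists = []   # completed lists, in the order they were closed
        self._stack = []  # lists currently open, innermost last
        self._item = None  # text fragments of the <li> being read, or None

    def _flush_item(self):
        # Store the current item (if any) in the innermost open list.
        if self._item is not None:
            text = " ".join("".join(self._item).split())
            if text:
                self._stack[-1].append(text)
            self._item = None

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self._flush_item()
            self._stack.append([])
        elif tag == "li" and self._stack:
            self._flush_item()
            self._item = []

    def handle_data(self, data):
        if self._item is not None:
            self._item.append(data)

    def handle_endtag(self, tag):
        if tag == "li":
            self._flush_item()
        elif tag in ("ul", "ol") and self._stack:
            self._flush_item()
            self.lists.append(self._stack.pop())


def extract_lists(html):
    """Return every list in an HTML string as a list of item strings."""
    parser = ListExtractor()
    parser.feed(html)
    parser.close()
    return parser.lists


sample_lists = "<ul><li>English</li><li> Deutsch </li></ul><ol><li>One</li></ol>"
print(extract_lists(sample_lists))  # [['English', 'Deutsch'], ['One']]
```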
Mode: Article Content
🌐 web_fetch
We owe you an explanation. You deserve an explanation, so please don't skip this 1-minute read. Our fundraiser won't last long...
(This is just the first paragraph the readability extractor grabbed. You don't know if there's more content, what the headings are, or how many links exist.)
🕷️ Smart Scraper
[smart-scraper] Article content:
  Title: Wikipedia
  Headings: 1
  Paragraphs: 1
  First paragraphs: We owe you an explanation. You deserve an explanation, so please don't skip this 1-minute read. Our fundraiser won't last...
Feature Comparison
| Feature | web_fetch | Smart Scraper |
|---|---|---|
| Title extraction | ❌ raw HTML only | title: "Wikipedia" |
| Headings | ❌ you parse them | Headings: 1 |
| Links | ❌ you regex them | Links: 330 |
| Tables | ❌ raw HTML | Tables found (0) |
| Lists | ❌ raw HTML | Lists: 6 + items |
| Prices | ❌ you detect them | Prices: 4 |
| Images | ❌ raw HTML | Images: 0 |
| Metadata | ❌ raw HTML | Metadata keys: 6 |
| Caching | ❌ no cache | ✅ 5min TTL, LRU eviction |
| Structured output | ❌ raw HTML → text | ✅ parsed data summary |
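
The caching row mentions a 5-minute TTL with LRU eviction. As an illustration of what that combination means, here is a minimal sketch of such a cache in Python. The class name `TTLLRUCache`, the `max_entries` default of 100, and the injectable `clock` parameter are all assumptions made for this example; the source only states the TTL and the eviction policy, not how Smart Scraper implements them.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """Small cache with a per-entry time-to-live and least-recently-used eviction.

    Hypothetical sketch: max_entries is an assumed value, not from the source.
    """

    def __init__(self, max_entries=100, ttl_seconds=300.0, clock=time.monotonic):
        self._max_entries = max_entries
        self._ttl = ttl_seconds          # 300 s = the 5-minute TTL
        self._clock = clock              # injectable so expiry can be tested
        self._entries = OrderedDict()    # key -> (stored_at, value), LRU first

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            # Entry is older than the TTL: drop it and report a miss.
            del self._entries[key]
            return None
        # A hit makes this key the most recently used.
        self._entries.move_to_end(key)
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock(), value)
        self._entries.move_to_end(key)
        # Evict least recently used entries once over capacity.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)


cache = TTLLRUCache(max_entries=2)
cache.put("https://www.wikipedia.org", "parsed summary")
print(cache.get("https://www.wikipedia.org"))  # parsed summary
```

With a cache like this, a repeated request for the same URL within five minutes can be answered without fetching the page again, while stale or rarely used entries are dropped.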