Fast by default
Sub-second on straightforward pages.
ByteKit returns clean markdown, HTML, screenshots, search results, and page metadata from public URLs. Straightforward pages can return in under a second. Protected sites work through the same API. Failed zero-byte captures cost nothing.
{
  status: "success",
  scrapeId: "sc_01j9abc123",
  contentLength: 18432,
  formats: { markdown: "#..." }
}

No browser fleet. No proxy pile. No stale memory dressed up as confidence.
Sub-second on straightforward pages.
Same tool call when pages resist.
Cache hits cost half. Failed zero-byte captures cost $0.
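The billing rules above can be sketched as a small function. This is an illustration only: `basePrice`, the `cacheHit` flag, and the `"failed"` status value are assumptions made for the example, not published ByteKit fields or rates.

```typescript
// Hypothetical sketch of the pricing rules stated above.
// `cacheHit`, the "failed" status, and `basePrice` are assumed names and values.
type CaptureOutcome = {
  status: "success" | "failed";
  contentLength: number;
  cacheHit: boolean;
};

function captureCost(outcome: CaptureOutcome, basePrice: number): number {
  // A failed capture that returned zero bytes costs nothing.
  if (outcome.status === "failed" && outcome.contentLength === 0) {
    return 0;
  }
  // A cache hit costs half the base price.
  if (outcome.cacheHit) {
    return basePrice / 2;
  }
  // Everything else is charged the full base price (an assumption;
  // the page does not say how other failures are billed).
  return basePrice;
}
```

With an assumed base price of 2 units, a cache hit would come to 1 unit and a failed zero-byte capture to 0.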
An agent without live sources is guessing with better punctuation.
It can remember old docs, hallucinate changed prices, cite broken pages, and miss the one line that matters. Give it a web capture tool and make it look before it speaks.
Pull the live page before the agent repeats something expensive.
Fetch framework docs, API references, changelogs, and help center articles.
Check product pages, plan pages, stock states, and regional pages.
Return title, final URL, contentLength, and markdown.
Capture screenshots when text alone is not enough.
Use /sitemap when the agent has the domain but not the page: get the site's public URLs first.
Give compatible agents scrape and screenshot tools directly.
Call /v1/scrape, /v1/screenshots, or /v1/monitors from your own orchestration code.
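A call to /v1/scrape from orchestration code might look roughly like the sketch below. Only the path, the `url`, `formats`, and `country` parameters, and the envelope fields come from this page. The base URL, the Bearer-token header, the POST method with a JSON body, and the helper names `buildScrapeRequest` and `scrape` are assumptions for illustration; check the API reference for the real contract.

```typescript
// Sketch only: base URL, auth scheme, HTTP method, and helper names are assumed.
interface ScrapeParams {
  url: string;
  formats: string[];
  country?: string;
}

// Envelope fields as shown in the sample response on this page.
interface ScrapeEnvelope {
  status: string;
  scrapeId: string;
  contentLength: number;
  formats: { markdown?: string };
}

interface RequestInitLike {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Minimal shape of a fetch-compatible function, so a stub can be injected in tests.
type FetchLike = (
  url: string,
  init: RequestInitLike
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

function buildScrapeRequest(
  baseUrl: string,
  apiKey: string,
  params: ScrapeParams
): { url: string; init: RequestInitLike } {
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/scrape`,
    init: {
      method: "POST", // assumed
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`, // assumed auth scheme
      },
      body: JSON.stringify(params),
    },
  };
}

async function scrape(
  fetchImpl: FetchLike,
  baseUrl: string,
  apiKey: string,
  params: ScrapeParams
): Promise<ScrapeEnvelope> {
  const { url, init } = buildScrapeRequest(baseUrl, apiKey, params);
  const res = await fetchImpl(url, init);
  if (!res.ok) {
    throw new Error(`scrape request failed with HTTP ${res.status}`);
  }
  return (await res.json()) as ScrapeEnvelope;
}
```

Taking the fetch function as a parameter keeps the sketch testable without network access. In a runtime that provides a global `fetch`, that function can be passed in directly.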
bytekit_scrape({ url, formats: ["markdown"], country }) → envelope

| If you build it | You also inherit | ByteKit handles |
|---|---|---|
| Playwright capture | Browser updates, queues, timeouts | Capture, screenshots, and recordings |
| Proxy routing | Countries, sessions, vendor quirks | Geo-targeted requests and sane defaults |
| Content cleanup | Navigation, headers, footers, cookie junk | Markdown tuned for LLM and RAG workflows |
Need full browser automation with clicks, forms, and logged-in workflows? Use Playwright. ByteKit is for source capture inside agents, not for pretending every website is fair game.