diff --git a/docs.json b/docs.json
index 68da0db..ab7bf64 100644
--- a/docs.json
+++ b/docs.json
@@ -23,6 +23,7 @@
         "introduction",
         "install",
         "transition-from-v1-to-v2",
+        "transition-from-firecrawl",
         {
           "group": "Use Cases",
           "pages": [
diff --git a/transition-from-firecrawl.mdx b/transition-from-firecrawl.mdx
new file mode 100644
index 0000000..66957f0
--- /dev/null
+++ b/transition-from-firecrawl.mdx
@@ -0,0 +1,134 @@
+---
+title: Transition from Firecrawl to ScrapeGraph v2
+description: A practical guide for migrating your scraping workflows from Firecrawl to ScrapeGraph v2
+---
+
+## Why switch?
+
+ScrapeGraph v2 offers AI-powered scraping, extraction, search, crawling, and monitoring through a unified API at a competitive price. If you're coming from Firecrawl, this page maps every endpoint, SDK method, and concept to its ScrapeGraph equivalent so you can migrate quickly.
+
+## Feature comparison at a glance
+
+| Capability | Firecrawl | ScrapeGraph v2 |
+|---|---|---|
+| Single-page scrape | `POST /v2/scrape` | `POST /api/v2/scrape` |
+| Structured extraction | `POST /v2/extract` (LLM) | `POST /api/v2/extract` (LLM) |
+| Web search | `POST /v2/search` | `POST /api/v2/search` |
+| Crawl (multi-page) | `POST /v2/crawl` (async) | `POST /api/v2/crawl` (async) |
+| Monitored changes | Change tracking (format option) | `POST /api/v2/monitor` (first-class, cron-based) |
+
+## Authentication
+
+| | Firecrawl | ScrapeGraph v2 |
+|---|---|---|
+| Header | `Authorization: Bearer fc-...` | `SGAI-APIKEY: sgai-...` |
+| Env var | `FIRECRAWL_API_KEY` | `SGAI_API_KEY` |
+| Base URL | `https://api.firecrawl.dev/v2` | `https://api.scrapegraphai.com/api/v2` |
+
+## SDK installation
+
+| | Firecrawl | ScrapeGraph v2 |
+|---|---|---|
+| Python | `pip install firecrawl-py` | `pip install scrapegraph-py` |
+| Node.js | `npm i @mendable/firecrawl-js` | `npm i scrapegraph-js` |
+| CLI | — | `npm i -g just-scrape` |
+| MCP server | — | `pip install scrapegraph-mcp` |
+
+## Migration checklist
+
+### Update dependencies
+
+```bash
+# Remove Firecrawl
+pip uninstall firecrawl-py             # Python
+npm uninstall @mendable/firecrawl-js   # Node.js
+
+# Install ScrapeGraph
+pip install scrapegraph-py             # Python
+npm install scrapegraph-js             # Node.js
+```
+
+### Update environment variables
+
+```bash
+# Replace
+# FIRECRAWL_API_KEY=fc-...
+
+# With
+SGAI_API_KEY=sgai-...
+```
+
+Get your API key from the [dashboard](https://scrapegraphai.com/dashboard).
+
+### Update imports and client initialization
+
+```python
+# Before
+from firecrawl import Firecrawl
+fc = Firecrawl(api_key="fc-...")
+
+# After
+from scrapegraph_py import ScrapeGraphAI
+# reads SGAI_API_KEY from env, or pass explicitly: ScrapeGraphAI(api_key="...")
+sgai = ScrapeGraphAI()
+```
+
+```javascript
+// Before
+import Firecrawl from "@mendable/firecrawl-js";
+const fc = new Firecrawl({ apiKey: "fc-..." });
+
+// After
+import { ScrapeGraphAI } from "scrapegraph-js";
+// reads SGAI_API_KEY from env, or pass explicitly: new ScrapeGraphAI({ apiKey: "..." })
+const sgai = new ScrapeGraphAI();
+```
+
+### Replace method calls
+
+Use the endpoint mapping tables above to update each call. The main patterns:
+
+- `fc.scrape()` -> `sgai.scrape(ScrapeRequest(...))`
+- `fc.extract()` -> `sgai.extract(ExtractRequest(...))`
+- `fc.search()` -> `sgai.search(SearchRequest(...))`
+- `fc.start_crawl()` -> `sgai.crawl.start(CrawlRequest(...))`
+- Change tracking -> `sgai.monitor.create(MonitorCreateRequest(...))`
+
+### Handle the `ApiResult` wrapper
+
+The ScrapeGraph Python and JS SDKs wrap every response in an `ApiResult` — no exceptions to catch.
+Check `status` before reading `data`:
+
+```python
+result = sgai.extract(ExtractRequest(url="https://example.com", prompt="..."))
+if result.status == "success":
+    data = result.data
+else:
+    print(f"Error: {result.error}")
+```
+
+```javascript
+const result = await sgai.scrape({ url: "https://example.com", formats: [{ type: "markdown" }] });
+if (result.status === "success") {
+  console.log(result.data?.results.markdown?.data);
+} else {
+  console.error(result.error);
+}
+```
+
+Direct HTTP callers (curl, fetch) receive the unwrapped response body — the envelope is applied client-side by the SDKs.
+
+### Test and verify
+
+Run your existing test suite and compare outputs. ScrapeGraph returns equivalent data structures — the main difference is the `ApiResult` envelope in the SDKs.
+
+## Full SDK documentation
+
+- [Python SDK](/sdks/python)
+- [JavaScript SDK](/sdks/javascript)
+- [CLI (just-scrape)](/services/cli/introduction)
+- [MCP Server](/services/mcp-server/introduction)
+- [API Reference](/api-reference/introduction)
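
Reviewer note on the crawl mapping in the page above: both `POST /v2/crawl` and `POST /api/v2/crawl` are listed as async, so migrated code still needs a polling loop before it can read results. A minimal, SDK-agnostic sketch of that loop, assuming the `"processing"`/`"success"`/`"failed"` status values follow the `ApiResult` convention the page describes (the `fetch_status` callable and any `sgai.crawl.get_status(...)` method name in the usage note are illustrative assumptions, not confirmed SDK surface):

```python
import time


def wait_for_job(fetch_status, poll_interval=2.0, timeout=120.0):
    """Poll an async job (e.g. a crawl) until it leaves the "processing" state.

    fetch_status: zero-argument callable returning a dict such as
    {"status": "processing" | "success" | "failed", "data": ...}.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status()
        if result["status"] != "processing":
            # "success" or "failed": return and let the caller inspect it,
            # matching the check-`status`-before-`data` pattern above
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before timeout")
        time.sleep(poll_interval)
```

Under the assumed SDK surface, a migrated `fc.start_crawl()` call site would become something like `wait_for_job(lambda: sgai.crawl.get_status(job_id))`.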