Overview

Add Firecrawl to any ModuleX agent or workflow. The integration provides AI-powered web scraping, crawling, and search through the Firecrawl v1 REST API (api.firecrawl.dev/v1). Its actions cover single-URL scraping, URL discovery (map), web search, multi-page crawls with job-ID status polling, LLM-based structured extraction, and batch scraping.
Categories: Web Search & Scraping · Data · Search · Auth: API Key, ModuleX Managed Key · Actions: 7

Authentication

API Key Authentication

Authenticate using your Firecrawl API key

Required Credentials

Field: Firecrawl API Key
Description: Your Firecrawl API key for authentication
Required: Yes
Format: fc-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
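
As a rough illustration of what the key is used for: the Firecrawl REST API accepts the key as a Bearer token in the Authorization header. The helper name build_auth_headers is hypothetical, and how ModuleX attaches the key internally is not described on this page, so treat this as a sketch rather than the integration's actual code.

```python
def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return HTTP headers carrying a Firecrawl API key as a Bearer token."""
    key = (api_key or "").strip()
    if not key:
        # Reject empty or blank keys before any request is attempted.
        raise ValueError("Firecrawl API key is empty or blank")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Calling build_auth_headers("fc-...") with a real key yields a dict that can be passed as the headers argument of most HTTP clients.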

Available Actions

Scrape

Parameters

  • url (string, required): The URL to scrape
  • formats (array): Content formats to return, any of markdown, html, rawHtml, screenshot, links, summary (Default: ["markdown"])
  • only_main_content (boolean): Extract only the main content, filtering out navigation/footers (Default: true)
  • include_tags (array): HTML tags to specifically include in extraction
  • exclude_tags (array): HTML tags to exclude from extraction
  • wait_for (integer): Time in milliseconds to wait for dynamic content
  • mobile (boolean): Use mobile viewport
  • remove_base64_images (boolean): Remove base64-encoded images from output
  • max_age (integer): Maximum age in milliseconds for cached content. Enables faster scrapes for cached pages.
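
A minimal sketch of assembling an input for the scrape action from the parameters listed above. The helper build_scrape_input and the constant SCRAPE_PARAMS are hypothetical names introduced for illustration; only the parameter names and defaults come from this page.

```python
from typing import Any

# Parameter names documented for the scrape action.
SCRAPE_PARAMS = {
    "url", "formats", "only_main_content", "include_tags", "exclude_tags",
    "wait_for", "mobile", "remove_base64_images", "max_age",
}


def build_scrape_input(url: str, **options: Any) -> dict[str, Any]:
    """Build a scrape input dict, applying the documented defaults."""
    if not url:
        raise ValueError("url is required")
    unknown = set(options) - SCRAPE_PARAMS
    if unknown:
        raise ValueError(f"Unknown scrape parameters: {sorted(unknown)}")
    # Documented defaults: formats=["markdown"], only_main_content=true.
    payload: dict[str, Any] = {
        "url": url,
        "formats": ["markdown"],
        "only_main_content": True,
    }
    payload.update(options)  # explicit options override the defaults
    return payload


example = build_scrape_input(
    "https://example.com",
    formats=["markdown", "links"],
    wait_for=2000,       # wait 2 s for dynamic content
    max_age=3_600_000,   # accept cached content up to 1 hour old
)
```

Rejecting unknown names early keeps typos such as waitfor from being silently ignored.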

Response

{
  "additionalProperties": false,
  "properties": {
    "success": {
      "title": "Success",
      "type": "boolean"
    },
    "error": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Error"
    },
    "data": {
      "anyOf": [
        {
          "additionalProperties": true,
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Data"
    }
  },
  "required": [
    "success"
  ],
  "title": "ScrapeOutput",
  "type": "object"
}
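
Every action on this page returns the same envelope: a required boolean success, an optional string error, and an optional object data. The sketch below shows one way a caller might validate that envelope before using it; ActionResult and parse_action_result are hypothetical names, not part of the integration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ActionResult:
    success: bool
    error: Optional[str] = None
    data: Optional[dict[str, Any]] = None


def parse_action_result(raw: dict[str, Any]) -> ActionResult:
    """Validate a raw action response against the documented envelope."""
    # additionalProperties is false at the top level, so reject extra keys.
    extra = set(raw) - {"success", "error", "data"}
    if extra:
        raise ValueError(f"Unexpected keys in response: {sorted(extra)}")
    if not isinstance(raw.get("success"), bool):
        raise ValueError("'success' is required and must be a boolean")
    error = raw.get("error")
    if error is not None and not isinstance(error, str):
        raise ValueError("'error' must be a string or null")
    data = raw.get("data")
    if data is not None and not isinstance(data, dict):
        raise ValueError("'data' must be an object or null")
    return ActionResult(success=raw["success"], error=error, data=data)
```

Because data allows additional properties, its contents are left untyped here; callers inspect whatever the upstream API returned.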

Map Website

Parameters

  • url (string, required): Starting URL for URL discovery
  • Optional search term to filter URLs
  • sitemap (string): Sitemap handling, one of 'include', 'skip', or 'only'
  • include_subdomains (boolean): Include URLs from subdomains in results
  • limit (integer): Maximum number of URLs to return
  • ignore_query_parameters (boolean): Do not return URLs with query parameters (Default: true)

Response

{
  "additionalProperties": false,
  "properties": {
    "success": {
      "title": "Success",
      "type": "boolean"
    },
    "error": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Error"
    },
    "data": {
      "anyOf": [
        {
          "additionalProperties": true,
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Data"
    }
  },
  "required": [
    "success"
  ],
  "title": "MapWebsiteOutput",
  "type": "object"
}

Search

Parameters

  • query (string, required): Search query string (supports operators)
  • limit (integer): Maximum number of results to return (Default: 5)
  • tbs (string): Time-based search filter
  • location (string): Location parameter for search results
  • scrape_options (object): Options for scraping search results

Response

{
  "additionalProperties": false,
  "properties": {
    "success": {
      "title": "Success",
      "type": "boolean"
    },
    "error": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Error"
    },
    "data": {
      "anyOf": [
        {
          "additionalProperties": true,
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Data"
    }
  },
  "required": [
    "success"
  ],
  "title": "SearchOutput",
  "type": "object"
}

Crawl

Parameters

  • url (string, required): Starting URL for the crawl
  • exclude_paths (array): URL paths to exclude from crawling
  • include_paths (array): Only crawl these URL paths
  • max_depth (integer): Maximum depth to crawl relative to the entered URL
  • limit (integer): Maximum number of pages to crawl (Default: 100)
  • Allow crawling links to external domains (Default: false)
  • Allow crawling links to parent paths (Default: false)
  • ignore_sitemap (boolean): Ignore the website sitemap when crawling (Default: false)
  • scrape_options (object): Options for scraping each page

Response

{
  "additionalProperties": false,
  "properties": {
    "success": {
      "title": "Success",
      "type": "boolean"
    },
    "error": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Error"
    },
    "data": {
      "anyOf": [
        {
          "additionalProperties": true,
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Data"
    }
  },
  "required": [
    "success"
  ],
  "title": "CrawlOutput",
  "type": "object"
}

Check Crawl Status

Parameters

  • crawl_id (string, required): Crawl job ID returned from the crawl action

Response

{
  "additionalProperties": false,
  "properties": {
    "success": {
      "title": "Success",
      "type": "boolean"
    },
    "error": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Error"
    },
    "data": {
      "anyOf": [
        {
          "additionalProperties": true,
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Data"
    }
  },
  "required": [
    "success"
  ],
  "title": "CheckCrawlStatusOutput",
  "type": "object"
}
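
Crawls are asynchronous: the crawl action returns a job ID, and the status action is polled with that ID until the job finishes. The loop below is a sketch of that pattern under stated assumptions. The check_status callable stands in for however the status action is invoked, poll_crawl is a hypothetical helper, and the status field with values such as "completed" or "failed" inside data is an assumption about the upstream body, which this page does not specify.

```python
import time
from typing import Any, Callable

# Assumed terminal values of data["status"]; not confirmed by this page.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def poll_crawl(
    check_status: Callable[[str], dict[str, Any]],
    crawl_id: str,
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll the status action until the crawl reaches a terminal state."""
    for attempt in range(max_attempts):
        result = check_status(crawl_id)
        if not result.get("success"):
            return result  # surface the error envelope to the caller
        status = (result.get("data") or {}).get("status")
        if status in TERMINAL_STATUSES:
            return result
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"Crawl {crawl_id} did not finish after {max_attempts} checks")
```

Passing sleep as a parameter keeps the loop testable without real delays; in production the default time.sleep applies.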

Extract

Parameters

  • urls (array, required): Array of URLs to extract information from
  • prompt (string): Custom prompt for the LLM extraction
  • schema_definition (object): JSON schema for structured data extraction
  • Allow extraction from external links (Default: false)
  • Enable web search for additional context (Default: false)
  • include_subdomains (boolean): Include subdomains in extraction (Default: false)

Response

{
  "additionalProperties": false,
  "properties": {
    "success": {
      "title": "Success",
      "type": "boolean"
    },
    "error": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Error"
    },
    "data": {
      "anyOf": [
        {
          "additionalProperties": true,
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Data"
    }
  },
  "required": [
    "success"
  ],
  "title": "ExtractOutput",
  "type": "object"
}
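
To make the extract parameters more concrete, here is a sketch of an input that pairs a prompt with a JSON schema. The product schema, the example URL, and the helper build_extract_input are invented for illustration; only the parameter names urls, prompt, schema_definition, and include_subdomains come from this page.

```python
from typing import Any, Optional


def build_extract_input(
    urls: list[str],
    prompt: Optional[str] = None,
    schema_definition: Optional[dict[str, Any]] = None,
    include_subdomains: bool = False,
) -> dict[str, Any]:
    """Build an extract input; urls is the only required parameter."""
    if not urls:
        raise ValueError("urls must contain at least one URL")
    payload: dict[str, Any] = {
        "urls": list(urls),
        "include_subdomains": include_subdomains,
    }
    if prompt is not None:
        payload["prompt"] = prompt
    if schema_definition is not None:
        payload["schema_definition"] = schema_definition
    return payload


# Illustrative JSON schema describing the fields the LLM should return.
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["name"],
}

extract_input = build_extract_input(
    ["https://example.com/products/1"],
    prompt="Extract the product name and price.",
    schema_definition=product_schema,
)
```

Supplying a schema as well as a prompt constrains the shape of the extracted data, which makes downstream parsing more predictable.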

Batch Scrape

Parameters

  • urls (array, required): Array of URLs to scrape
  • formats (array): Content formats to extract (Default: ["markdown"])
  • only_main_content (boolean): Extract only the main content (Default: true)

Response

{
  "additionalProperties": false,
  "properties": {
    "success": {
      "title": "Success",
      "type": "boolean"
    },
    "error": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Error"
    },
    "data": {
      "anyOf": [
        {
          "additionalProperties": true,
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Data"
    }
  },
  "required": [
    "success"
  ],
  "title": "BatchScrapeOutput",
  "type": "object"
}

Limits & Quotas

  • HTTP timeouts: 120s for scrape/map/search/status; 180s for crawl/extract/batch (long-running jobs).
  • Snake_case input parameters are converted to camelCase for the upstream API (only_main_content → onlyMainContent, etc.).
  • Response data carries the upstream JSON body unchanged so callers see the rich nested metadata Firecrawl returns.
  • Failures (non-2xx responses, timeouts, parse errors) surface as success=False plus an error message; empty or blank API keys short-circuit before any request is sent.
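
The conversion and timeout rules above can be sketched as follows. This is an illustration, not the integration's code: snake_to_camel, to_upstream_params, and TIMEOUTS_S are hypothetical names, the dictionary keys simply mirror the labels in the first bullet, and the real integration may special-case some parameter names or handle nested objects differently.

```python
from typing import Any

# HTTP timeouts in seconds, keyed by the labels used in the list above.
TIMEOUTS_S = {
    "scrape": 120, "map": 120, "search": 120, "status": 120,
    "crawl": 180, "extract": 180, "batch": 180,
}


def snake_to_camel(name: str) -> str:
    """Convert a snake_case name to camelCase."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_upstream_params(params: dict[str, Any]) -> dict[str, Any]:
    """Rename top-level keys only; nested objects are passed through as-is."""
    return {snake_to_camel(key): value for key, value in params.items()}


converted = to_upstream_params({"url": "https://example.com", "only_main_content": True})
```

Here converted holds the keys url and onlyMainContent, matching the example given in the second bullet.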
