
Overview

Add Scrape.do to any ModuleX agent or workflow. This integration wraps the Scrape.do enterprise web-scraping API and exposes five actions: basic HTTP scraping, JS-rendered browser scraping, screenshots, markdown conversion, and credit-usage stats. The four scraping actions call api.scrape.do; the usage-stats action calls its /info endpoint.
Categories: Web Search & Scraping · Data Extraction · Auth: API Key · Actions: 5

Authentication

API Key Authentication

Authenticate using your Scrape.do API key

Required Credentials

Field | Description | Required | Format
Scrape.do API Key | Your Scrape.do API key for authentication | Yes | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Available Actions

Parameters

url
string
required
URL to scrape
method
string
HTTP method (GET, POST, PUT, DELETE, HEAD) (Default: GET)
body
string
Request body for POST/PUT requests
super_proxy
boolean
Use residential & mobile proxy network
geo_code
string
Country code for proxy location (e.g. ‘us’, ‘uk’, ‘de’)
regional_geo_code
string
Regional proxy location: ‘europe’, ‘asia’, ‘africa’, ‘oceania’, ‘northamerica’, ‘southamerica’
session_id
integer
Sticky session ID (0-1000000) for IP persistence
device
string
Device emulation (‘desktop’, ‘mobile’, ‘tablet’)
timeout
integer
Request timeout in ms (5000-120000)
retry_timeout
integer
Retry timeout in ms (5000-55000)
disable_retry
boolean
Disable automatic retry on failure
disable_redirection
boolean
Disable following redirects
custom_headers
boolean
Let Scrape.do add default headers
extra_headers
boolean
Forward extra upstream headers
forward_headers
boolean
Forward client headers to target
set_cookies
string
Cookies to send (JSON string or header)
block_resources
boolean
Block images, CSS, fonts to speed up loading
block_ads
boolean
Block advertisements
output
string
Output format (‘raw’ or ‘markdown’)
transparent_response
boolean
Return the origin response body with no parsing
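
The documented ranges and enumerations can be checked before a request is sent. The following is a minimal sketch, not the integration's own validator: the function name validate_scrape_input and the table names are hypothetical, and only the limits come from the parameter list above.

```python
from typing import Any

# Limits copied from the parameter list above; the names are illustrative.
_RANGES = {
    "session_id": (0, 1_000_000),
    "timeout": (5_000, 120_000),       # milliseconds
    "retry_timeout": (5_000, 55_000),  # milliseconds
}
_CHOICES = {
    "method": {"GET", "POST", "PUT", "DELETE", "HEAD"},
    "device": {"desktop", "mobile", "tablet"},
    "output": {"raw", "markdown"},
}


def validate_scrape_input(params: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of params with the method default applied, or raise ValueError."""
    if not params.get("url"):
        raise ValueError("url is required")
    checked = dict(params)
    checked.setdefault("method", "GET")  # documented default
    for name, allowed in _CHOICES.items():
        value = checked.get(name)
        if value is not None and value not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    for name, (low, high) in _RANGES.items():
        value = checked.get(name)
        if value is not None and not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    return checked
```

Calling validate_scrape_input({"url": "https://example.com", "timeout": 10000}) returns the same parameters plus method set to GET.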

Response

{
  "additionalProperties": false,
  "properties": {
    "success": {
      "title": "Success",
      "type": "boolean"
    },
    "error": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Error"
    },
    "status_code": {
      "anyOf": [
        {
          "type": "integer"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Status Code"
    },
    "content_type": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Content Type"
    },
    "data": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Data"
    },
    "is_binary": {
      "default": false,
      "title": "Is Binary",
      "type": "boolean"
    },
    "payload": {
      "anyOf": [
        {
          "additionalProperties": true,
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Payload"
    }
  },
  "required": [
    "success"
  ],
  "title": "ScrapeOutput",
  "type": "object"
}
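
For callers that prefer typed access, the schema above could be mirrored in a small dataclass. This is a sketch under assumptions: the names ScrapeResult and parse_scrape_output are hypothetical, and the result is assumed to arrive as a plain dict with the field names shown in the schema.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ScrapeResult:
    """Typed mirror of the ScrapeOutput schema; only success is required."""
    success: bool
    error: Optional[str] = None
    status_code: Optional[int] = None
    content_type: Optional[str] = None
    data: Optional[str] = None
    is_binary: bool = False
    payload: Optional[dict[str, Any]] = None


def parse_scrape_output(raw: dict[str, Any]) -> ScrapeResult:
    """Build a ScrapeResult from a raw result dict."""
    if "success" not in raw:
        raise ValueError("missing required field: success")
    # Unknown keys raise TypeError here, which mirrors additionalProperties: false.
    return ScrapeResult(**raw)
```

Fields that the result omits fall back to the schema defaults (null, or false for is_binary).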

Parameters

url
string
required
URL to scrape
method
string
HTTP method (GET, POST, PUT, DELETE, HEAD) (Default: GET)
body
string
Request body for POST/PUT requests
super_proxy
boolean
Use residential & mobile proxy network
geo_code
string
Country code for proxy location (e.g. ‘us’, ‘uk’, ‘de’)
regional_geo_code
string
Regional proxy location: ‘europe’, ‘asia’, ‘africa’, ‘oceania’, ‘northamerica’, ‘southamerica’
session_id
integer
Sticky session ID (0-1000000) for IP persistence
device
string
Device emulation (‘desktop’, ‘mobile’, ‘tablet’)
timeout
integer
Request timeout in ms (5000-120000)
retry_timeout
integer
Retry timeout in ms (5000-55000)
disable_retry
boolean
Disable automatic retry on failure
disable_redirection
boolean
Disable following redirects
custom_headers
boolean
Let Scrape.do add default headers
extra_headers
boolean
Forward extra upstream headers
forward_headers
boolean
Forward client headers to target
set_cookies
string
Cookies to send (JSON string or header)
block_resources
boolean
Block images, CSS, fonts to speed up loading
block_ads
boolean
Block advertisements
output
string
Output format (‘raw’ or ‘markdown’)
wait_until
string
Wait condition: ‘domcontentloaded’, ‘networkidle0’, ‘networkidle2’, ‘load’
wait_selector
string
CSS selector to wait for before capturing
custom_wait
integer
Additional wait time in ms
width
integer
Browser viewport width (Default: 1920)
height
integer
Browser viewport height (Default: 1080)
play_with_browser
string
JSON-encoded Play-with-Browser action list

Response

{
  "additionalProperties": false,
  "properties": {
    "success": {
      "title": "Success",
      "type": "boolean"
    },
    "error": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Error"
    },
    "status_code": {
      "anyOf": [
        {
          "type": "integer"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Status Code"
    },
    "content_type": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Content Type"
    },
    "data": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Data"
    },
    "is_binary": {
      "default": false,
      "title": "Is Binary",
      "type": "boolean"
    },
    "payload": {
      "anyOf": [
        {
          "additionalProperties": true,
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Payload"
    }
  },
  "required": [
    "success"
  ],
  "title": "ScrapeWithJsOutput",
  "type": "object"
}

Parameters

url
string
required
URL to capture
full_page
boolean
Capture full page instead of viewport (Default: false)
selector
string
CSS selector for element-specific screenshot
super_proxy
boolean
Use residential & mobile proxy network
geo_code
string
Country code for proxy location (e.g. ‘us’, ‘uk’, ‘de’)
regional_geo_code
string
Regional proxy location: ‘europe’, ‘asia’, ‘africa’, ‘oceania’, ‘northamerica’, ‘southamerica’
session_id
integer
Sticky session ID (0-1000000) for IP persistence
device
string
Device emulation (‘desktop’, ‘mobile’, ‘tablet’)
timeout
integer
Request timeout in ms (5000-120000)
retry_timeout
integer
Retry timeout in ms (5000-55000)
disable_retry
boolean
Disable automatic retry on failure
disable_redirection
boolean
Disable following redirects
width
integer
Viewport width (Default: 1920)
height
integer
Viewport height (Default: 1080)
wait_until
string
Wait condition for render completion
wait_selector
string
CSS selector to wait for before capturing
custom_wait
integer
Additional wait time in ms
block_ads
boolean
Block advertisements
custom_headers
boolean
Let Scrape.do add default headers
set_cookies
string
Cookies to send
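
The full_page and selector parameters select between three capture modes, and the two cannot be combined. A minimal sketch of that rule follows; the function name screenshot_mode and the returned labels are hypothetical and not taken from the integration's code.

```python
from typing import Optional


def screenshot_mode(full_page: bool = False, selector: Optional[str] = None) -> str:
    """Resolve the capture mode, rejecting the invalid combination."""
    if full_page and selector:
        raise ValueError("full_page and selector are mutually exclusive")
    if full_page:
        return "full_page"
    if selector:
        return "element"
    return "viewport"  # default when neither option is set
```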

Response

{
  "additionalProperties": false,
  "properties": {
    "success": {
      "title": "Success",
      "type": "boolean"
    },
    "error": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Error"
    },
    "status_code": {
      "anyOf": [
        {
          "type": "integer"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Status Code"
    },
    "content_type": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Content Type"
    },
    "data": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Data"
    },
    "is_binary": {
      "default": false,
      "title": "Is Binary",
      "type": "boolean"
    },
    "payload": {
      "anyOf": [
        {
          "additionalProperties": true,
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Payload"
    }
  },
  "required": [
    "success"
  ],
  "title": "TakeScreenshotOutput",
  "type": "object"
}
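
A screenshot result carries the image in data. Assuming, as the Limits & Quotas section describes, that image bodies arrive base64-encoded with is_binary set to true, the bytes could be recovered as sketched below. The helper name screenshot_bytes is hypothetical.

```python
import base64
from typing import Any


def screenshot_bytes(result: dict[str, Any]) -> bytes:
    """Return the raw body bytes from a TakeScreenshotOutput-shaped dict."""
    if not result.get("success"):
        raise RuntimeError(result.get("error") or "screenshot failed")
    data = result.get("data")
    if data is None:
        raise ValueError("result contains no data")
    if result.get("is_binary"):
        # Binary bodies are assumed to be base64 text.
        return base64.b64decode(data)
    # Textual bodies are returned as UTF-8 bytes.
    return data.encode("utf-8")
```

The returned bytes can be written straight to a file opened in binary mode.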

Parameters

url
string
required
URL to scrape
render
boolean
Enable JavaScript rendering (Default: false)
method
string
HTTP method (Default: GET)
body
string
Request body for POST/PUT
super_proxy
boolean
Use residential & mobile proxy network
geo_code
string
Country code for proxy location (e.g. ‘us’, ‘uk’, ‘de’)
regional_geo_code
string
Regional proxy location: ‘europe’, ‘asia’, ‘africa’, ‘oceania’, ‘northamerica’, ‘southamerica’
session_id
integer
Sticky session ID (0-1000000) for IP persistence
device
string
Device emulation (‘desktop’, ‘mobile’, ‘tablet’)
timeout
integer
Request timeout in ms (5000-120000)
retry_timeout
integer
Retry timeout in ms (5000-55000)
disable_retry
boolean
Disable automatic retry on failure
disable_redirection
boolean
Disable following redirects
block_resources
boolean
Block images, CSS, fonts to speed up loading
block_ads
boolean
Block advertisements
custom_headers
boolean
Let Scrape.do inject default headers
set_cookies
string
Cookies to send
play_with_browser
string
JSON-encoded Play-with-Browser script

Response

{
  "additionalProperties": false,
  "properties": {
    "success": {
      "title": "Success",
      "type": "boolean"
    },
    "error": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Error"
    },
    "status_code": {
      "anyOf": [
        {
          "type": "integer"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Status Code"
    },
    "markdown": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Markdown"
    },
    "raw": {
      "anyOf": [
        {
          "additionalProperties": true,
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Raw"
    }
  },
  "required": [
    "success"
  ],
  "title": "ScrapeToMarkdownOutput",
  "type": "object"
}

Response

{
  "additionalProperties": false,
  "properties": {
    "success": {
      "title": "Success",
      "type": "boolean"
    },
    "error": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Error"
    },
    "status_code": {
      "anyOf": [
        {
          "type": "integer"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Status Code"
    },
    "stats": {
      "anyOf": [
        {
          "additionalProperties": true,
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Stats"
    }
  },
  "required": [
    "success"
  ],
  "title": "GetUsageStatsOutput",
  "type": "object"
}

Limits & Quotas

  • Each scrape action exposes 20+ optional knobs (proxy routing, geo-targeting, device emulation, cookies, headers, wait conditions, viewport). All map to Scrape.do’s camelCase query string keys via a single _PARAM_MAP translation table.
  • take_screenshot runs in exactly one of three modes (viewport, full-page, or element). The tool validates that full_page and selector aren’t both set.
  • Output shape varies per upstream response:
    • JSON → payload: dict
    • text/html/markdown → data: str with is_binary=False
    • image/* → data: <base64> with is_binary=True
  • Scrape operations time out after 180s (matches legacy); the usage-stats endpoint times out after 30s.
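
The snake_case to camelCase translation mentioned in the first bullet can be sketched as follows. Everything here is an assumption made for illustration: the table is a small hypothetical subset, the upstream key names (including token and super) are not confirmed by this page, and to_query is not the integration's actual function.

```python
from typing import Any
from urllib.parse import urlencode

# Hypothetical subset of a translation table; upstream key names are assumed.
_PARAM_MAP = {
    "super_proxy": "super",
    "geo_code": "geoCode",
    "regional_geo_code": "regionalGeoCode",
    "session_id": "sessionId",
    "disable_retry": "disableRetry",
    "block_resources": "blockResources",
    "wait_until": "waitUntil",
}


def to_query(token: str, url: str, options: dict[str, Any]) -> str:
    """Translate snake_case options into an upstream query string."""
    query = {"token": token, "url": url}  # "token" is an assumed key name
    for name, value in options.items():
        if value is None:
            continue  # unset options are omitted from the request
        key = _PARAM_MAP.get(name, name)
        # Booleans are sent as lowercase true/false; other values as strings.
        query[key] = str(value).lower() if isinstance(value, bool) else str(value)
    return urlencode(query)
```

Keeping the mapping in one table means a new option only needs a new entry rather than a change to the request-building logic.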
