Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
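As a sketch, the Authorization header described above can be assembled like this (the token value is a placeholder, and the helper name is illustrative, not part of the API):

```python
# Build the Bearer authentication header described above.
# "YOUR_AUTH_TOKEN" is a placeholder, not a real token.
def auth_headers(token: str) -> dict:
    """Return request headers carrying the Bearer token."""
    return {"Authorization": f"Bearer {token}"}

headers = auth_headers("YOUR_AUTH_TOKEN")
```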
Body
maxResults and maxDepth will be ignored if useSitemap or entireWebsite is true
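The precedence rule above can be sketched as follows — when useSitemap or entireWebsite is true, maxResults and maxDepth are ignored. The helper name and dict shape are illustrative assumptions; only the four field names come from this page:

```python
# Hypothetical helper illustrating the precedence rule: sitemap/entire-site
# crawls ignore the maxResults and maxDepth limits.
def effective_limits(body: dict) -> dict:
    """Drop maxResults/maxDepth when useSitemap or entireWebsite is true."""
    if body.get("useSitemap") or body.get("entireWebsite"):
        return {k: v for k, v in body.items()
                if k not in ("maxResults", "maxDepth")}
    return dict(body)
```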
URL to start crawling from
"https://example.com"
Maximum number of results to collect
Required range: 1 <= x <= 10000
Maximum depth of pages to crawl
Required range: 1 <= x <= 100
Whether to use sitemap.xml to crawl the website. If true, maxResults and maxDepth will be ignored.
false
Whether to crawl the entire website. If true, maxResults and maxDepth will be ignored.
false
Whether to exclude non-main tags from the crawl results' markdown
true
Whether to include links in the crawl results' markdown
true
Whether to remove duplicate text fragments that also appear on other pages
true
Instructions defining how the AI should extract specific content from the crawl results
"Extract only pricing info about the product"
The engine to use for the crawl. Auto: automatically detects the best engine (default). Cheerio: fast, great for static websites. Playwright: great for dynamic websites that use JavaScript frameworks.
Available options: auto, cheerio, playwright
Default: "auto"
Whether to use static IPs for the crawl. The target website can whitelist these IPs for the crawl. The static IP will be 154.17.150.0
false
Timeout duration in minutes
Required range: x >= 60
Default: 60
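Putting the body fields together, here is a hedged sketch of building and validating a request payload. The names maxResults, maxDepth, useSitemap, and entireWebsite come from the note above; url, engine, and the Python helper itself are assumptions about the JSON keys, not confirmed API names:

```python
# Allowed engine values, per the engine field documented above.
ALLOWED_ENGINES = {"auto", "cheerio", "playwright"}

def build_crawl_body(url: str, max_results: int = 10, max_depth: int = 10,
                     use_sitemap: bool = False, entire_website: bool = False,
                     engine: str = "auto") -> dict:
    """Validate the documented ranges and return a request body dict."""
    if not 1 <= max_results <= 10000:
        raise ValueError("maxResults must be between 1 and 10000")
    if not 1 <= max_depth <= 100:
        raise ValueError("maxDepth must be between 1 and 100")
    if engine not in ALLOWED_ENGINES:
        raise ValueError(f"engine must be one of {sorted(ALLOWED_ENGINES)}")
    return {
        "url": url,                      # assumed key name for the start URL
        "maxResults": max_results,
        "maxDepth": max_depth,
        "useSitemap": use_sitemap,
        "entireWebsite": entire_website,
        "engine": engine,                # assumed key name for the engine
    }

body = build_crawl_body("https://example.com")
```

The validation mirrors the ranges listed above, so out-of-range values fail before any request is sent.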
Response
Crawl object
URL to start crawling from
"https://example.com"
Maximum number of results to collect
Required range: 1 <= x <= 10000
Example: 10
Maximum depth of pages to crawl
Required range: 1 <= x <= 100
Example: 10
Whether to use sitemap.xml to crawl the website. If true, maxResults and maxDepth will be ignored.
false
Whether to crawl the entire website. If true, maxResults and maxDepth will be ignored.
false
Whether to exclude non-main tags from the crawl results' markdown
true
Whether to include links in the crawl results' markdown
true
Whether to remove duplicate text fragments that also appear on other pages
true
Instructions defining how the AI should extract specific content from the crawl results
"Extract only pricing info"
The engine to use for the crawl. Auto: automatically detects the best engine (default). Cheerio: fast, great for static websites. Playwright: great for dynamic websites that use JavaScript frameworks.
Available options: auto, cheerio, playwright
Example: "auto"
Whether to use static IPs for the crawl. The target website can whitelist these IPs for the crawl. The static IP will be 154.17.150.0
false
Timeout duration in minutes
Required range: x >= 60
Example: 1800
Identification number of the crawl
"6870e36787c81925622df818"
Timestamp when the crawl was created
Current status of the crawl
Available options: running, succeeded, failed, aborted, timed_out, error
Example: "timed_out"
Timestamp when the crawl was completed
Duration of the crawl in seconds
Required range: x >= 0
Example: 1800
ID of the brand associated with the crawl
Array of URLs to start crawling from
Example: ["https://example.com"]
Count of pages extracted
Required range: x >= 0
Example: 100
Origin of the crawl request
Available options: api, web
Example: "web"
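As a sketch of consuming the Crawl object above, a client can poll until the status leaves "running". The status values come from the enum documented above; the JSON key name "status" is an assumption about the response casing:

```python
# Terminal statuses, per the status enum documented above: everything
# except "running" means the crawl has finished one way or another.
TERMINAL_STATUSES = {"succeeded", "failed", "aborted", "timed_out", "error"}

def is_finished(crawl: dict) -> bool:
    """True once the crawl has reached a terminal status."""
    return crawl.get("status") in TERMINAL_STATUSES

# Illustrative sample object, not a real API response.
sample = {"status": "running"}
```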