Launches a headless browser and scrapes the data from a website.
Currently Workflows supports the following extractions:
cookies
Stores the cookies that have been collected by the end of the (headless) browser session.
property | type | required | description |
---|---|---|---|
type | enumerator (see description) | yes | Must be cookies. |
actions | array | yes | Array of actions to perform. Supports both url and click. See below for more details per action type. |

property | type | required | description |
---|---|---|---|
type | enumerator (url) | yes | Must be url. |
value | string | yes | The URL to navigate to. |

property | type | required | description |
---|---|---|---|
type | enumerator (click) | yes | Must be click. |
by | enumerator (class_name, css_selector, id, link_text, name, partial_link_text, tag_name, xpath) | yes | The method used to find the element to click on. |
value | string | yes | Element-specific value used to find the element to click on. |

extract:
  type: cookies
  actions:
    - type: url
      value: https://www.onesecondbefore.com
    - type: click
      by: id
      value: btn-f82hf-allow
    - type: url
      value: https://www.onesecondbefore.com
    - type: url
      value: https://www.onesecondbefore.com/contact/
    - type: url
      value: https://www.onesecondbefore.com/resources/
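The property tables above can be read as a small set of validation rules. The following Python sketch checks an `extract` mapping against those rules before it is submitted. It is only an illustration: the function name `validate_extract` and the wording of the messages are hypothetical and are not part of Workflows.

```python
# Locator methods allowed in the `by` property of a click action,
# copied from the click table above.
CLICK_BY = {
    "class_name", "css_selector", "id", "link_text",
    "name", "partial_link_text", "tag_name", "xpath",
}


def validate_extract(extract):
    """Return a list of problems found; an empty list means no problems.

    Hypothetical helper: it mirrors the documented tables only and does
    not reproduce any validation that Workflows itself may perform.
    """
    problems = []
    if extract.get("type") != "cookies":
        problems.append("type: must be 'cookies'")

    actions = extract.get("actions")
    if not isinstance(actions, list):
        problems.append("actions: must be an array")
        return problems

    for i, action in enumerate(actions):
        where = f"actions[{i}]"
        if not isinstance(action, dict):
            problems.append(f"{where}: must be a mapping")
            continue
        kind = action.get("type")
        if kind not in ("url", "click"):
            problems.append(f"{where}.type: must be 'url' or 'click'")
            continue
        if not isinstance(action.get("value"), str):
            problems.append(f"{where}.value: required string")
        if kind == "click" and action.get("by") not in CLICK_BY:
            problems.append(f"{where}.by: must be one of {sorted(CLICK_BY)}")
    return problems


extract = {
    "type": "cookies",
    "actions": [
        {"type": "url", "value": "https://www.onesecondbefore.com"},
        {"type": "click", "by": "id", "value": "btn-f82hf-allow"},
        {"type": "url", "value": "https://www.onesecondbefore.com/contact/"},
    ],
}
print(validate_extract(extract))  # prints []
```

Returning a list of problems instead of raising on the first one makes it possible to report every mistake in a configuration at once.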
item | description |
---|---|
Pre-formatted schema | Yes. This from task comes with a pre-formatted schema. The schema depends on the type. |
Used Technology | The from task uses Selenium and a headless Chrome browser to scrape the data. |
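To make the Selenium-based approach more concrete, here is a rough Python sketch of how a list of url and click actions could be replayed in a headless Chrome session, with the cookies read at the end. This is not the Workflows implementation; `make_headless_chrome`, `collect_cookies` and `LOCATORS` are hypothetical names, and the sketch assumes the `selenium` package and a local Chrome installation.

```python
# Locator strategy strings. These are the values of Selenium's `By`
# constants (for example By.CLASS_NAME == "class name", By.ID == "id"),
# keyed by the `by` enumerator values used in the click action.
LOCATORS = {
    "class_name": "class name",
    "css_selector": "css selector",
    "id": "id",
    "link_text": "link text",
    "name": "name",
    "partial_link_text": "partial link text",
    "tag_name": "tag name",
    "xpath": "xpath",
}


def make_headless_chrome():
    """Create a headless Chrome driver (needs selenium and Chrome installed)."""
    # Imported lazily so the rest of the module loads without selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    return webdriver.Chrome(options=options)


def collect_cookies(driver, actions):
    """Replay the actions in order, then return the session's cookies."""
    for action in actions:
        if action["type"] == "url":
            # Navigate to the given URL.
            driver.get(action["value"])
        elif action["type"] == "click":
            # Find the element with the requested strategy and click it.
            element = driver.find_element(LOCATORS[action["by"]], action["value"])
            element.click()
        else:
            raise ValueError(f"unsupported action type: {action['type']!r}")
    # Selenium returns the cookies as a list of dictionaries.
    return driver.get_cookies()
```

With Selenium and Chrome available, a run would create the driver with `make_headless_chrome()`, pass it to `collect_cookies` together with the `actions` list, and call `driver.quit()` afterwards. Because the driver is passed in as an argument, the replay logic can also be exercised with a stub object in tests.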