The task with type do_views, is the task that handles view management. It will look for a directory in your repository with the name: views/. Please save all view files in the following way:
DATABASE OR PROJECT_ID/SCHEMA OR DATASET_ID/VIEWNAME.sql
include
statement. The root of the statement will be the templates
directory in your repository.The views are created with the following procedure:
mapping
parameter is used to replace the database folder name with the destination database/project id. This is needed meant to be able to create views in the production and sandbox databases._tmp_f7f87eh
) in the destination database/project and schema/dataset, to check if it contains any errors. If there was an error creating the view, this will be shown in the logs and the view will be retried in a next iteration, since it might depend on a change in any of the other views in the same update batch.You can set certain variables in your view that impact view creation behavior. These are used to create e.g. materialized views and set the clustering. Please refer to the Jinja2 documentation on how to set variables. The standard set of variables from the configuration YAML that you can also use in your templates can be used in the views. These are: transfer, job, task, load and client_cloud.
variable | type | database | description |
---|---|---|---|
osb_materialized | boolean | Snowflake and BigQuery | Don't set or set to false for a standard view. Set to true if the view must be materialized. |
osb_partition_by | string | BigQuery | Only use if osb_materialized=true . Set to what you would normally set after PARTITION BY. |
osb_cluster_by | string | Snowflake and BigQuery | Only use if osb_materialized=true . Set to what you would normally set after CLUSTER BY. |
osb_schedule_label | string | Snowflake and BigQuery | Label should match the deduplication.label in the configuration. This way Workflows knows which views to replace. Handy if you need to have your views updated for example at different moments during the day. |
{%- set osb_schedule_label = 'daily' -%}
{%- set osb_materialized = true -%}
{%- set osb_partition_by = 'DATE(dstamp)' -%}
{%- set osb_cluster_by = 'site_id' -%}
SELECT TIMESTAMP_TRUNC(tstamp, DAY) AS dstamp, site_id, hit_type, COUNT(1) AS hits
FROM `my-production-project.osb_analytics.hits`
WHERE DATE(tstamp) >= DATE('{{ task.trigger_date.strftime('%Y-%m-%d') }}')-14
AND DATE(tstamp) < DATE('{{ task.trigger_date.strftime('%Y-%m-%d') }}')
GROUP BY 1,2,3
The properties below can be used at root level.
The following properties can be set under the task
parameter.
property | type | required | description |
---|---|---|---|
type | enumerator(do_views) | yes | Contains the task type. Must be `do_views`. |
task:
type: do_views
The following properties can be set under the deduplicate
parameter.
property | type | required | description |
---|---|---|---|
type | enumerator(replace, replace_always, replace_scheduled) | yes | replace = Replace view when a new version is checked in the repository. This type of deduplication is usually used after a deploy task.always_replace = Always replaces all views. This type of deduplication is usually used in Sandbox mode.schedule_replace = Use this when you want to have views replaced on e.g. a daily schedule. To make it work, you should add the osb_schedule_label view configuration variable to every view you want to have Workflows replaced on a schedule. You should also make sure that the value of the view configuration variable osb_schedule_label is the same as deduplication.label . This way you can have different tasks at different schedules updating different views. This type of deduplication is usually used for views that contain date filters that have to be updated on a daily basis. |
label | string | no | Only needed when deduplication.type=schedule_replace . Contains the label that the view configuration property have to be set to, in the view template.
|
deduplicate:
type: schedule_replace
label: daily
The following properties can be set under the load
parameter.
property | type | required | description |
---|---|---|---|
mapping | object | yes | Contains the mappings from the folder in the views directory to the database (in Snowflake) or project id (in BigQuery). We recommend you to keep the local folder name the same as the destination database/project in production. |
load:
mapping:
onesecondbefore-gcloud: onesecondbefore-gcloud-sandbox
Link to create view documentation |
---|
BigQuery documentation |
Snowflake documentation |