Schema
The source's data structure
The first step in defining a Data Policy is knowing what your source data looks like. This source data will most likely live in a Data Platform or Data Catalog, but you will also be able to define the structure yourself. Below we demonstrate the different options for defining your schema.

We start with retrieving a blueprint policy. A blueprint policy is a Data Policy in which only the source ref and fields, and potentially a ruleset, are populated. It serves as a starting point for defining the rest of the Data Policy. Whether a ruleset is present in the blueprint policy depends on whether global transforms are defined. A blueprint policy is retrieved from either a Data Catalog or a Processing Platform.

A blueprint policy consists of metadata such as a title, version, create time, and last updated time, as well as user-defined tags. It also holds information about the processing platform: its type and the configured id. Most importantly, it contains the fields, or schema, of the source data. Each field consists of an array of `name_parts`, which is the path to the field and typically contains only one entry for columnar/flat data. Each field also contains the type, whether or not the field is required, and user-defined tags. Lastly, the source section contains a reference to the source table and, again, user-defined tags for the source.

YAML
data_policy:
  rule_sets: []
  id: ""
  metadata:
    tags: []
    title: SCHEMA.TABLE
    version: ""
    create_time: null
    update_time: null
  source:
    fields:
      - name_parts:
          - TRANSACTIONID
        tags: []
        type: numeric
        required: true
      - name_parts:
          - USERID
        tags: []
        type: varchar
        required: true
      - name_parts:
          - EMAIL
        tags: []
        type: varchar
        required: true
      - name_parts:
          - AGE
        tags: []
        type: numeric
        required: true
      - name_parts:
          - BRAND
        tags: []
        type: varchar
        required: true
      - name_parts:
          - TRANSACTIONAMOUNT
        tags: []
        type: numeric
        required: true
    tags: []
    ref: SCHEMA.TABLE
  platform:
    platform_type: SNOWFLAKE
    id: snowflake-demo-connection

JSON
{
  "data_policy": {
    "rule_sets": [],
    "id": "",
    "metadata": {
      "tags": [],
      "title": "SCHEMA.TABLE",
      "version": "",
      "create_time": null,
      "update_time": null
    },
    "source": {
      "fields": [
        {
          "name_parts": [
            "TRANSACTIONID"
          ],
          "tags": [],
          "type": "NUMBER(38,0)",
          "required": true
        },
        {
          "name_parts": [
            "USERID"
          ],
          "tags": [],
          "type": "VARCHAR(16777216)",
          "required": true
        },
        {
          "name_parts": [
            "EMAIL"
          ],
          "tags": [],
          "type": "VARCHAR(16777216)",
          "required": true
        },
        {
          "name_parts": [
            "AGE"
          ],
          "tags": [],
          "type": "NUMBER(38,0)",
          "required": true
        },
        {
          "name_parts": [
            "BRAND"
          ],
          "tags": [],
          "type": "VARCHAR(16777216)",
          "required": true
        },
        {
          "name_parts": [
            "TRANSACTIONAMOUNT"
          ],
          "tags": [],
          "type": "NUMBER(38,0)",
          "required": true
        }
      ],
      "tags": [],
      "ref": "SCHEMA.TABLE"
    },
    "platform": {
      "platform_type": "SNOWFLAKE",
      "id": "snowflake-demo-connection"
    }
  }
}
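
To make the structure of a blueprint policy concrete, the following is a minimal Python sketch that parses a blueprint in the JSON form shown on this page and lists each field's path, type, and required flag. The names `Field` and `parse_fields`, and the choice to join `name_parts` with a dot for display, are illustrative assumptions and not part of any official client. The sample is shortened to two of the six fields.

```python
import json
from dataclasses import dataclass
from typing import List

# Shortened copy of the blueprint policy from this page (two of six fields).
BLUEPRINT_JSON = """
{
  "data_policy": {
    "rule_sets": [],
    "id": "",
    "metadata": {"tags": [], "title": "SCHEMA.TABLE", "version": "",
                 "create_time": null, "update_time": null},
    "source": {
      "fields": [
        {"name_parts": ["TRANSACTIONID"], "tags": [],
         "type": "NUMBER(38,0)", "required": true},
        {"name_parts": ["EMAIL"], "tags": [],
         "type": "VARCHAR(16777216)", "required": true}
      ],
      "tags": [],
      "ref": "SCHEMA.TABLE"
    },
    "platform": {"platform_type": "SNOWFLAKE",
                 "id": "snowflake-demo-connection"}
  }
}
"""


@dataclass(frozen=True)
class Field:
    """Hypothetical in-memory view of one entry in source.fields."""
    path: str       # name_parts joined with "." (a display choice, assumed)
    type: str
    required: bool


def parse_fields(blueprint: dict) -> List[Field]:
    """Extract the schema (the fields) from a parsed blueprint policy."""
    source = blueprint["data_policy"]["source"]
    return [
        Field(
            path=".".join(raw["name_parts"]),
            type=raw["type"],
            # Treat a missing flag as "not required" (an assumption).
            required=raw.get("required", False),
        )
        for raw in source["fields"]
    ]


if __name__ == "__main__":
    for field in parse_fields(json.loads(BLUEPRINT_JSON)):
        print(field.path, field.type, field.required)
```

For flat, columnar sources every `name_parts` array has a single entry, so the path is just the column name. For nested data, a field such as `["address", "city"]` would be displayed as `address.city` under the joining convention assumed here.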
If your Data Platform (or Processing Platform) has knowledge of the source's data structure, we provide both a [REST API](../reference/api-reference.md#processing-platforms-platformid-tables-table_id-blueprint-policy) and a CLI to retrieve a blueprint policy. The minimum required permissions per Processing Platform are listed in our processing platform integration pages.

The source's data structure can also be retrieved from a Data Catalog. Here too we provide both a [REST API](../reference/api-reference.md#catalogs-catalogid-databases-databaseid-schemas-schemaid-tables-tableid-blueprint-policy) and a CLI to retrieve the blueprint policy. The minimum required permissions per Data Catalog are listed in our data catalog integration pages.
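
As a rough illustration of the two REST routes, the Python sketch below only builds the request URLs. The path layouts are assumptions inferred from the anchors of the API reference links on this page, and `BASE_URL` and both function names are hypothetical. Check the API reference for the exact paths, methods, and authentication before relying on any of this.

```python
from urllib.parse import quote

# Hypothetical base URL; replace with the address of your own deployment.
BASE_URL = "http://localhost:8080"


def _segment(value: str) -> str:
    """Percent-encode one path segment, so a '/' inside an id is escaped."""
    return quote(value, safe="")


def platform_blueprint_url(platform_id: str, table_id: str) -> str:
    # Assumed path, inferred from the reference anchor
    # "processing-platforms-platformid-tables-table_id-blueprint-policy".
    return (
        f"{BASE_URL}/processing-platforms/{_segment(platform_id)}"
        f"/tables/{_segment(table_id)}/blueprint-policy"
    )


def catalog_blueprint_url(
    catalog_id: str, database_id: str, schema_id: str, table_id: str
) -> str:
    # Assumed path, inferred from the reference anchor
    # "catalogs-catalogid-databases-databaseid-schemas-schemaid-tables-tableid-blueprint-policy".
    return (
        f"{BASE_URL}/catalogs/{_segment(catalog_id)}"
        f"/databases/{_segment(database_id)}"
        f"/schemas/{_segment(schema_id)}"
        f"/tables/{_segment(table_id)}/blueprint-policy"
    )


if __name__ == "__main__":
    # Ids taken from the sample blueprint on this page.
    print(platform_blueprint_url("snowflake-demo-connection", "SCHEMA.TABLE"))
```

Encoding each segment separately keeps ids that contain reserved characters from being read as extra path components. The response to either request is expected to be a blueprint policy in the form shown earlier on this page.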