User Defined Functions in Python
Create and use a Python UDF on Databricks
In this tutorial we will
create a User Defined Function written in Python. We will use Databricks as a processing platform.
create a Data Policy that uses this UDF, and apply it to the platform.
inspect the resulting SQL View.
show some results.
Creating the UDF
On Databricks this is quite simple. In SQL we would do the following, making sure to execute it in the catalog and schema of the data.
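As a minimal sketch of what such a UDF could look like: the function name squared, the catalog pace and the schema alpha_test are assumptions made for illustration (the catalog and schema are borrowed from the view name used later in this tutorial), and the NULL handling is an added precaution. The Python function shows the logic of the UDF body, and the helper builds a Databricks CREATE FUNCTION ... LANGUAGE PYTHON statement around the same logic.

```python
from typing import Optional

# Hypothetical names: the function name, catalog and schema below are
# assumptions for illustration only; adjust them to where your data lives.
CATALOG = "pace"
SCHEMA = "alpha_test"
FUNCTION_NAME = "squared"


def squared(x: Optional[int]) -> Optional[int]:
    """Plain-Python equivalent of the UDF body: square x, pass None through."""
    return None if x is None else x * x


def create_udf_sql(catalog: str, schema: str, name: str) -> str:
    """Build a Databricks SQL statement that registers a scalar Python UDF."""
    return (
        f"CREATE OR REPLACE FUNCTION {catalog}.{schema}.{name}(x INT)\n"
        "RETURNS INT\n"
        "LANGUAGE PYTHON\n"
        "AS $$\n"
        "return None if x is None else x * x\n"
        "$$"
    )


if __name__ == "__main__":
    # Paste the printed statement into a Databricks SQL editor or notebook cell.
    print(create_udf_sql(CATALOG, SCHEMA, FUNCTION_NAME))
```

Fully qualifying the function name is one way to make sure it ends up in the catalog and schema of the data, regardless of the session's current defaults.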
Create a Data Policy
We have used a demo table with an integer age column in it, and downloaded a blueprint data policy:
We have then edited the policy file, and included the following field transformation:
So this field transformation defines that any user (principals: []) receives the squared value of the age column.
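To make the effect of the empty principals list concrete, here is a small plain-Python model of how such a transformation could be resolved for a user. This is not PACE's implementation: the Transform class, the resolve function and the "first match wins" rule are hypothetical. The only behaviour taken from the text above is that an empty principals list applies to every user.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Transform:
    # Hypothetical model: groups the transform applies to; empty means everyone.
    principals: List[str]
    apply: Callable[[int], int]


def resolve(transforms: List[Transform], user_groups: List[str]) -> Optional[Transform]:
    """Return the first transform whose principals match the user, if any."""
    for t in transforms:
        if not t.principals or set(t.principals) & set(user_groups):
            return t
    return None


# One transform with no principals, mirroring the policy in this tutorial.
age_transforms = [Transform(principals=[], apply=lambda age: age * age)]

if __name__ == "__main__":
    # "analysts" is a made-up group name; any group resolves to the same transform.
    chosen = resolve(age_transforms, user_groups=["analysts"])
    print(chosen.apply(12))
```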
Create the SQL VIEW on Databricks
First upsert the policy file to PACE.
And then actually apply it on the processing platform (alternatively, you can execute the upsert command with the --apply flag to immediately apply it).
Investigate the results
If everything went right, we now have a SQL VIEW named pace.alpha_test.demo_pace_view with this view definition:
The original demo table contains fairly normal ages:
But the view clearly shows them squared:
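As a quick sanity check, assuming you have fetched the age column from both the table and the view in the same row order, a comparison along these lines would confirm the transformation. The helper name find_mismatches is hypothetical, and the sample values are invented for illustration; they are not the tutorial's data.

```python
from typing import Iterable, List, Tuple


def find_mismatches(
    table_ages: Iterable[int], view_ages: Iterable[int]
) -> List[Tuple[int, int]]:
    """Return (table_age, view_age) pairs where the view value is not the square."""
    return [(t, v) for t, v in zip(table_ages, view_ages) if v != t * t]


if __name__ == "__main__":
    # Invented sample values, not taken from the demo table.
    table_sample = [30, 41, 58]
    view_sample = [900, 1681, 3364]
    # An empty list means every view value is the square of its table value.
    print(find_mismatches(table_sample, view_sample))
```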