User Defined Functions in Python
Create and use a Python UDF on Databricks
In this tutorial we will:
create a User Defined Function written in Python, using Databricks as the processing platform.
create a Data Policy that uses this UDF, and apply it to the platform.
inspect the resulting SQL View.
show some results.
On Databricks this is quite simple: we create the function with a SQL statement, making sure to execute it in the catalog and schema of the data.
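A minimal sketch of what such a definition could look like, under stated assumptions: the function name `square_with_python` is hypothetical (the tutorial does not name the function), and the catalog and schema `pace.alpha_test` are taken from the view name used later in this tutorial. The Python logic is kept as an ordinary function so it can be checked locally, and the generated statement uses the Databricks `CREATE FUNCTION ... LANGUAGE PYTHON AS $$ ... $$` form.

```python
# Assumptions: the function name is hypothetical; the catalog and schema
# are inferred from the view name pace.alpha_test.demo_pace_view.
CATALOG = "pace"
SCHEMA = "alpha_test"
FUNCTION_NAME = "square_with_python"


def square(x: int) -> int:
    """Plain-Python version of the UDF body, useful for local checks."""
    return x * x


def create_function_sql() -> str:
    """Build the DDL for a scalar Python UDF that squares an integer."""
    qualified = f"{CATALOG}.{SCHEMA}.{FUNCTION_NAME}"
    return "\n".join([
        f"CREATE OR REPLACE FUNCTION {qualified}(x INT)",
        "RETURNS INT",
        "LANGUAGE PYTHON",
        "AS $$",
        "return x * x",
        "$$",
    ])


if __name__ == "__main__":
    # Paste the printed statement into a Databricks SQL editor or notebook.
    print(create_function_sql())
```

Fully qualifying the function name is one way to make sure the function ends up in the same catalog and schema as the data, regardless of the session defaults.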
NOTE: Make sure that both the PACE service credential and any user who might access the resulting SQL VIEW on Databricks have EXECUTE permissions on the function.
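As a sketch of how those permissions could be granted, the snippet below generates `GRANT EXECUTE ON FUNCTION` statements for a list of principals. The function name and the principal names are hypothetical placeholders, not values from this tutorial; substitute the PACE service credential and the actual users or groups.

```python
# Hypothetical fully qualified function name and principals (assumptions).
FUNCTION_FQN = "pace.alpha_test.square_with_python"
PRINCIPALS = ["pace-service-principal", "analysts"]


def grant_execute_statements(function_fqn: str, principals: list[str]) -> list[str]:
    """Return one GRANT statement per principal, quoting names with backticks."""
    return [
        f"GRANT EXECUTE ON FUNCTION {function_fqn} TO `{principal}`;"
        for principal in principals
    ]


if __name__ == "__main__":
    # Run the printed statements as a user allowed to manage the function.
    for statement in grant_execute_statements(FUNCTION_FQN, PRINCIPALS):
        print(statement)
```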
We have used a demo table with an age integer column in it, and downloaded a blueprint data policy:
We have then edited the policy file, and included the following field transformation:
So this field transformation defines that any user (principals: []) receives the squared value of the age column.
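To make the effect concrete, here is a small local simulation of that transformation. The rows are made-up sample values, not the contents of the demo table, and `apply_field_transform` is a hypothetical helper for illustration, not part of PACE.

```python
from typing import Any, Callable


def square(x: int) -> int:
    """Same logic as the UDF: return the squared value."""
    return x * x


def apply_field_transform(
    rows: list[dict[str, Any]],
    field: str,
    transform: Callable[[Any], Any],
) -> list[dict[str, Any]]:
    """Return copies of the rows with `transform` applied to one field."""
    return [{**row, field: transform(row[field])} for row in rows]


if __name__ == "__main__":
    # Made-up sample rows (assumption), used only to show the mapping.
    sample_rows = [{"id": 1, "age": 30}, {"id": 2, "age": 41}]
    for row in apply_field_transform(sample_rows, "age", square):
        print(row)
```

Only the age field changes; every other column is passed through untouched, which is what a single field transformation in a policy is meant to do.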
First upsert the policy file to PACE.
Then apply it on the processing platform (alternatively, run the upsert command with the --apply flag to apply it immediately).
If everything went right, we now have a SQL VIEW named pace.alpha_test.demo_pace_view with this view definition:
The original demo table contains fairly normal ages:
But the view clearly shows them squared: