Databricks
Connecting PACE to a Databricks workspace
Service principal creation and privileges
PACE leverages Databricks' Unity Catalog to create dynamic views based on Data Policies. A few steps are required to connect a PACE instance with a Databricks workspace.
Create a service principal, e.g. pace-user, through Databricks' Account Console. On the new principal's information page, generate an OAuth secret. Make sure to copy the secret's value. Additional secrets can be generated anytime.
On the Roles tab, enable the Account admin role. This permission is required for PACE to retrieve the available groups, for listing and Data Policy validation purposes.
Next, grant the User permission to the principal on your desired Databricks Workspace, also through the Account Console.
In your workspace, either create a new SQL Warehouse, or choose an existing one, and grant usage permission on it to the service principal. Its size can be very small, as it is only used to create views or list tables.
For PACE to be able to list source tables and apply Data Policies through dynamic views, the service principal requires the USE CATALOG, USE SCHEMA and SELECT privileges on all desired source resources. The principal requires CREATE TABLE privileges on all target schemas where Data Policy views are to be created.
If one wishes to use User Defined Functions, the PACE service principal, as well as any user of views where the UDF is used, requires an EXECUTE permission on the function. See the UDF tutorial for more detail.
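The privileges described above can be written down as Unity Catalog GRANT statements. The following is a minimal sketch, not part of PACE itself: the helper names (source_grants, target_grants, udf_grant) and every catalog, schema, function and principal name passed to them are hypothetical placeholders. It assumes SELECT is granted at schema level; granting it per table is equally possible. In Databricks SQL a service principal is typically referenced by its application ID rather than its display name, so the principal argument is an assumption to verify in your workspace.

```python
def quote(identifier: str) -> str:
    """Backtick-quote a SQL identifier, doubling any embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def qualified(*parts: str) -> str:
    """Join quoted identifier parts into a dotted name, e.g. `cat`.`schema`."""
    return ".".join(quote(part) for part in parts)


def source_grants(principal: str, catalog: str, schema: str) -> list:
    """Privileges needed to list and read source tables in one schema."""
    target = quote(principal)
    return [
        f"GRANT USE CATALOG ON CATALOG {quote(catalog)} TO {target};",
        f"GRANT USE SCHEMA ON SCHEMA {qualified(catalog, schema)} TO {target};",
        # Assumption: SELECT on the whole schema rather than on single tables.
        f"GRANT SELECT ON SCHEMA {qualified(catalog, schema)} TO {target};",
    ]


def target_grants(principal: str, catalog: str, schema: str) -> list:
    """Privilege needed to create Data Policy views in a target schema."""
    return [
        f"GRANT CREATE TABLE ON SCHEMA {qualified(catalog, schema)} "
        f"TO {quote(principal)};"
    ]


def udf_grant(principal: str, catalog: str, schema: str, function: str) -> str:
    """Privilege needed by PACE and by view users to call a UDF."""
    return (
        f"GRANT EXECUTE ON FUNCTION {qualified(catalog, schema, function)} "
        f"TO {quote(principal)};"
    )


# Hypothetical names, for illustration only.
for statement in source_grants("pace-application-id", "main", "sales"):
    print(statement)
```

The generated statements can be reviewed and then run by a user who is allowed to grant on the objects involved, for example in the SQL editor of the workspace.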
PACE application properties
After following the above steps, provide the corresponding configuration to the PACE application for each Databricks workspace you want to connect with. For example:
The properties are expected to contain the following:
id: an arbitrary identifier unique within your organization for the specific platform (Databricks).
workspaceHost: the URL to your Databricks workspace, containing its unique deployment name.
accountHost: typically https://accounts.cloud.databricks.com.
accountId: the id of the Databricks account that owns the workspace.
clientId: the client id of the generated OAuth secret for the service principal to be used by PACE.
clientSecret: the secret value of this generated OAuth secret.
warehouseId: the id of the SQL Warehouse to be used by PACE.
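Before handing these properties to PACE, it can help to check that none of them is missing. The sketch below is an illustration only and is not how PACE itself reads its configuration: the class name DatabricksPlatformConfig and the validation rules (every property non-empty, both hosts absolute http(s) URLs) are assumptions, and only the field names are taken from the list above.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from urllib.parse import urlparse


@dataclass(frozen=True)
class DatabricksPlatformConfig:
    """Hypothetical container for the properties listed above."""

    id: str
    workspaceHost: str
    accountHost: str
    accountId: str
    clientId: str
    clientSecret: str
    warehouseId: str

    def problems(self) -> list[str]:
        """Return human-readable problems; an empty list means none found."""
        found = []
        # Every property must be a non-blank string.
        for field in fields(self):
            value = getattr(self, field.name)
            if not isinstance(value, str) or not value.strip():
                found.append(f"{field.name} is missing or empty")
        # Assumption: both hosts are given as absolute http(s) URLs.
        for name in ("workspaceHost", "accountHost"):
            value = getattr(self, name)
            if isinstance(value, str) and value.strip():
                parsed = urlparse(value)
                if parsed.scheme not in ("http", "https") or not parsed.netloc:
                    found.append(f"{name} should be an absolute http(s) URL")
        return found
```

Returning a list of problems rather than raising on the first one lets all configuration mistakes for a workspace be reported in a single pass.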