Rana
Rana

Reputation: 85

How to implement row level security in Azure Databricks using role data from a table present inside the database?

I need to implement row level security in my tables present inside databricks such that employees can only see data from their respective branches when they query the view. Also, I need to make sure that the user belongs to a specific group in databricks (The Admin Console Groups). So the task would be, first check if the user belongs to a databricks admin console group, then check its branch and then return the entries. I have the following tables

Employee_Details
EmpId   EmpDetailsCols
1        xyz
2        xyz
3        xyz
4        xyz 
5        xyz
6        xyz
7        xyz

Employee_Branch_Details
EmpId EmpBranch
1       L1    
2       L2    
3       L3    
4       L2
5       L2
6       L1
7       L3

// These are the group names for each branches.
Databricks_Groups_Details                    
DatabricksGroup     EmpBrach
DBGrp1_L1            L1
DBGrp1_L2            L2
DBGrp1_L3            L3
DBGrpH_H1             #

My approach to solve the problem is:

Create or replace view myview
as
select ed.*
from
employee_details ed join employee_branch_details ebd
on ed.empid = ebd.empid
and
(
(
IS_MEMBER('DBGrp1_L1')
AND ebd.empbranch = 'L1'
)
OR
(
IS_MEMBER('DBGrp1_L2')
AND ebd.empbranch = 'L2'
)
OR
(
IS_MEMBER('DBGrp1_L3')
AND ebd.empbranch = 'L3'
)
OR
(
IS_MEMBER('DBGrpH_H1)
)

Note: The role "DBGrpH_H1" should be able to see all the data because the branch for it is specified as "#", which means he has complete access.

This approach gives me the desired result but the problem is if in the future I have to add more groups, I'll have to manually come and change the view again and again and the code can get very lengthy. Is there any way where I can loop through the rows of my Databricks_Groups_Details tables and use it instead of typing all the roles manually, for eg, something like

Create or replace view myview
as
select ed.*, ebd.EmpBranch
from
employee_details ed join employee_branch_details ebd
on ed.empid = ebd.empid
/// some logic to loop through DatabricksGroup column and match ebd.EmpBranch with databricks_group_details.empbranch

Upvotes: 1

Views: 1146

Answers (1)

Alex Ott
Alex Ott

Reputation: 87259

The IS_MEMBER function can take the group name from the data itself, not necessary to use hardcoded group names. In this case, you just need to do one more join - with the Databricks_Groups_Details so you can pass group name as parameter to that function. Something like that (not tested):

Create or replace view myview
as
select ed.*
from
employee_details ed join employee_branch_details ebd
on ed.empid = ebd.empid
join Databricks_Groups_Details dgd
on ebd.EmpBranch = dgd.EmpBranch
and IS_MEMBER(dgd.DatabricksGroup)

Upvotes: 1

Related Questions