Reputation: 85
I need to implement row level security in my tables present inside databricks such that employees can only see data from their respective branches when they query the view. Also, I need to make sure that the user belongs to a specific group in databricks (The Admin Console Groups). So the task would be, first check if the user belongs to a databricks admin console group, then check its branch and then return the entries. I have the following tables
Employee_Details
EmpId EmpDetailsCols
1 xyz
2 xyz
3 xyz
4 xyz
5 xyz
6 xyz
7 xyz
Employee_Branch_Details
EmpId EmpBranch
1 L1
2 L2
3 L3
4 L2
5 L2
6 L1
7 L3
// These are the group names for each branches.
Databricks_Groups_Details
DatabricksGroup EmpBrach
DBGrp1_L1 L1
DBGrp1_L2 L2
DBGrp1_L3 L3
DBGrpH_H1 #
My approach to solve the problem is:
Create or replace view myview
as
select ed.*
from
employee_details ed join employee_branch_details ebd
on ed.empid = ebd.empid
and
(
(
IS_MEMBER('DBGrp1_L1')
AND ebd.empbranch = 'L1'
)
OR
(
IS_MEMBER('DBGrp1_L2')
AND ebd.empbranch = 'L2'
)
OR
(
IS_MEMBER('DBGrp1_L3')
AND ebd.empbranch = 'L3'
)
OR
(
IS_MEMBER('DBGrpH_H1)
)
Note: The role "DBGrpH_H1"
should be able to see all the data because the branch for it is specified as "#"
, which means he has complete access.
This approach gives me the desired result but the problem is if in the future I have to add more groups, I'll have to manually come and change the view again and again and the code can get very lengthy. Is there any way where I can loop through the rows of my Databricks_Groups_Details tables and use it instead of typing all the roles manually, for eg, something like
Create or replace view myview
as
select ed.*, ebd.EmpBranch
from
employee_details ed join employee_branch_details ebd
on ed.empid = ebd.empid
/// some logic to loop through DatabricksGroup column and match ebd.EmpBranch with databricks_group_details.empbranch
Upvotes: 1
Views: 1146
Reputation: 87259
The IS_MEMBER
function can take the group name from the data itself, not necessary to use hardcoded group names. In this case, you just need to do one more join - with the Databricks_Groups_Details
so you can pass group name as parameter to that function. Something like that (not tested):
Create or replace view myview
as
select ed.*
from
employee_details ed join employee_branch_details ebd
on ed.empid = ebd.empid
join Databricks_Groups_Details dgd
on ebd.EmpBranch = dgd.EmpBranch
and IS_MEMBER(dgd.DatabricksGroup)
Upvotes: 1