Reputation: 11
I want to implement RBAC-based auth in Airflow with Keycloak. Can someone help me with it? I have created the webserver.config file and I am using Docker to bring up the Airflow webserver.
from airflow.www_rbac.security import AirflowSecurityManager
from flask_appbuilder.security.manager import AUTH_OAUTH
import os
import json

AUTH_TYPE = AUTH_OAUTH
AUTH_USER_REGISTRATION_ROLE = "Admin"
OAUTH_PROVIDERS = [
    {
        'name': 'keycloak',
        'icon': 'fa-user-circle',
        'token_key': 'access_token',
        'remote_app': {
            'base_url': 'http://localhost:8180/auth/realms/airflow/protocol/openid-connect/',
            'request_token_params': {
                'scope': 'email profile'
            },
            'request_token_url': None,
            'access_token_url': 'http://localhost:8180/auth/realms/airflow/protocol/openid-connect/token',
            'authorize_url': 'http://localhost:8180/auth/realms/airflow/protocol/openid-connect/auth',
            'consumer_secret': "98ec2e89-9902-4577-af8c-f607e34aa659"
        }
    }
]
I have also set the following in airflow.cfg:
rbac = True
authenticate = True
But it's still not redirecting to Keycloak when Airflow is loaded.
I use:
docker build --rm --build-arg AIRFLOW_DEPS="datadog,dask" --build-arg PYTHON_DEPS="flask_oauthlib>=0.9" -t airflow .
and
docker run -d -p 8080:8080 airflow webserver
to run it.
Upvotes: 1
Views: 3798
Reputation: 811
I may be coming late to this one, and my answer may not work exactly as written because I'm using a different auth provider. However, it's still OAuth2, and in a previous life I used Keycloak, so my solution should also work there.
My answer makes use of authlib (at the time of writing, newer versions of Airflow have switched to it; I am on 2.1.2).
I've raised a feature request against Flask-AppBuilder, which Airflow uses, as its OAuth handler should really take care of things when the scope includes openid (you'd need to add this to your scopes).
From memory, Keycloak returns id_token alongside the access_token and refresh_token, so this code simply decodes what has already been returned.
import os
import logging
import re
import base64
import yaml
from flask import session
from airflow.www.security import AirflowSecurityManager
from flask_appbuilder.security.manager import AUTH_OAUTH

basedir = os.path.abspath(os.path.dirname(__file__))
log = logging.getLogger(__name__)

MY_PROVIDER = 'keycloak'

# WELL_KNOWN_URL, CLIENT_ID and CLIENT_SECRET are not defined in this snippet;
# set them to your provider's values before the OAUTH_PROVIDERS block below.


class customSecurityManager(AirflowSecurityManager):
    def oauth_user_info(self, provider, resp):
        if provider == MY_PROVIDER:
            log.debug("{0} response received : {1}".format(provider, resp))
            id_token = resp["id_token"]
            log.debug(str(id_token))
            # Reuse the JWT parsing that already exists for the Azure provider
            me = self._azure_jwt_token_parse(id_token)
            log.debug("Parse JWT token : {0}".format(me))
            if not me.get("name"):
                firstName = ""
                lastName = ""
            else:
                # First and last word of the "name" claim
                firstName = me.get("name").split(' ')[0]
                lastName = me.get("name").split(' ')[-1]
            return {
                "username": me.get("email"),
                "email": me.get("email"),
                "first_name": firstName,
                "last_name": lastName,
                "role_keys": me.get("groups", ['Guest'])
            }
        else:
            return {}


AUTH_TYPE = AUTH_OAUTH
AUTH_USER_REGISTRATION = True
AUTH_USER_REGISTRATION_ROLE = "Guest"
AUTH_ROLES_SYNC_AT_LOGIN = True
CSRF_ENABLED = True
# Maps the provider's group names (role_keys) to Airflow roles
AUTH_ROLES_MAPPING = {
    "Airflow_Users": ["User"],
    "Airflow_Admins": ["Admin"],
}
OAUTH_PROVIDERS = [
    {
        'name': MY_PROVIDER,
        'icon': 'fa-circle-o',
        'token_key': 'access_token',
        'remote_app': {
            'server_metadata_url': WELL_KNOWN_URL,
            'client_id': CLIENT_ID,
            'client_secret': CLIENT_SECRET,
            'client_kwargs': {
                'scope': 'openid groups',
                'token_endpoint_auth_method': 'client_secret_post'
            },
            'access_token_method': 'POST',
        }
    }
]
SECURITY_MANAGER_CLASS = customSecurityManager
Ironically, the Azure provider already returns id_token and it's handled, so my code makes use of that existing parsing to decode the id_token.
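To make the intent of AUTH_ROLES_MAPPING in the config above a bit more concrete, here is a rough sketch of the lookup it describes. The resolve_roles helper is hypothetical and is not Flask-AppBuilder's actual implementation; in particular, how FAB combines the result with AUTH_USER_REGISTRATION_ROLE is not modelled here.

```python
# Same mapping as in the config above: provider group name -> Airflow roles
AUTH_ROLES_MAPPING = {
    "Airflow_Users": ["User"],
    "Airflow_Admins": ["Admin"],
}


def resolve_roles(role_keys, mapping=AUTH_ROLES_MAPPING):
    """Hypothetical helper: collect the Airflow roles mapped to the given groups."""
    roles = set()
    for key in role_keys:
        # Groups with no entry in the mapping contribute no roles
        roles.update(mapping.get(key, []))
    return sorted(roles)


print(resolve_roles(["Airflow_Admins", "Airflow_Users"]))
print(resolve_roles(["Guest"]))
```

Note that the 'Guest' default returned as role_keys in oauth_user_info is a group key, not a role name, so with this mapping it would match nothing in the sketch above.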
Note you can turn on debug logging by setting the environment variable AIRFLOW__LOGGING__FAB_LOGGING_LEVEL to DEBUG.
If you switch on debug logs and see an entry like the following (note the id_token), you can probably use the code I've supplied.
DEBUG - OAUTH Authorized resp: {'access_token': '<redacted>', 'expires_in': 3600, 'id_token': '<redacted>', 'refresh_token': '<redacted>', 'scope': 'openid groups', 'token_type': 'Bearer', 'expires_at': <redacted - unix timestamp>}
The id_token is in 3 parts joined by a full stop (.). The middle part contains the user data and is simply base64 encoded.
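As a minimal sketch of that decoding step (not the parsing Flask-AppBuilder itself performs), the middle part can be read with the standard library alone. The decode_jwt_payload and _b64url helpers and the sample token are made up for illustration, and the signature is not verified here, so this is only suitable for inspecting a token you already trust.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims from the middle part of a JWT without verifying the signature."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    # JWT parts are URL-safe base64 with the trailing '=' padding stripped,
    # so restore the padding before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON part (only used to build the sample)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Hypothetical, unsigned sample token; a real id_token comes from the provider
sample_claims = {
    "email": "jane@example.com",
    "name": "Jane Doe",
    "groups": ["Airflow_Users"],
}
sample_token = ".".join([_b64url({"alg": "none"}), _b64url(sample_claims), "signature"])

claims = decode_jwt_payload(sample_token)
print(claims["email"], claims["groups"])
```

The claim names used here (email, name, groups) mirror the ones read in oauth_user_info; which claims your provider actually includes depends on its client and scope configuration.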
Upvotes: 1