Faris AbdulRaheem

Reputation: 11

Airflow authentication with RBAC and Keycloak

I want to implement RBAC-based authentication in Airflow with Keycloak. Can someone help me with it? I have created the webserver.config file and I am using Docker to bring up the Airflow webserver.

    from airflow.www_rbac.security import AirflowSecurityManager
    from flask_appbuilder.security.manager import AUTH_OAUTH
    import os
    import json
    AUTH_TYPE = AUTH_OAUTH

    AUTH_USER_REGISTRATION_ROLE = "Admin"
    OAUTH_PROVIDERS = [
        {
            'name': 'keycloak',
            'icon': 'fa-user-circle',
            'token_key': 'access_token',
            'remote_app': {
                'base_url': 'http://localhost:8180/auth/realms/airflow/protocol/openid-connect/',
                'request_token_params': {
                    'scope': 'email profile'
                },
                'request_token_url': None,
                'access_token_url': 'http://localhost:8180/auth/realms/airflow/protocol/openid-connect/token',
                'authorize_url': 'http://localhost:8180/auth/realms/airflow/protocol/openid-connect/auth',
                'consumer_secret': "98ec2e89-9902-4577-af8c-f607e34aa659"
            }
        }
    ]

I have also set the following in airflow.cfg:

rbac = True
authenticate = True

But it is still not redirecting to Keycloak when Airflow loads.

I use:

docker build --rm --build-arg AIRFLOW_DEPS="datadog,dask" --build-arg PYTHON_DEPS="flask_oauthlib>=0.9" -t airflow .

and

docker run -d -p 8080:8080 airflow webserver

to run it.

Upvotes: 1

Views: 3798

Answers (1)

Timothy c

Reputation: 811

I may be coming late to this one, and my answer may not work exactly as written because I'm using a different auth provider. However, it's still OAuth2, and in a previous life I used Keycloak, so my solution should also work there.

My answer makes use of authlib. (At the time of writing, newer versions of Airflow have switched to it. I am on 2.1.2.)

I've raised a feature request against Flask-AppBuilder, which Airflow uses, as its OAuth handler should really take care of things when the scope includes openid (you'd need to add this to your scopes).

From memory, Keycloak returns an id_token alongside the access_token and refresh_token, so this code simply decodes what has already been returned.

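import logging
import os

from airflow.www.security import AirflowSecurityManager
from flask_appbuilder.security.manager import AUTH_OAUTH

basedir = os.path.abspath(os.path.dirname(__file__))
log = logging.getLogger(__name__)

MY_PROVIDER = 'keycloak'

# These three values are not defined in the original snippet. Reading them from
# environment variables is an assumption; the variable names are hypothetical.
WELL_KNOWN_URL = os.environ["KEYCLOAK_WELL_KNOWN_URL"]
CLIENT_ID = os.environ["KEYCLOAK_CLIENT_ID"]
CLIENT_SECRET = os.environ["KEYCLOAK_CLIENT_SECRET"]

class customSecurityManager(AirflowSecurityManager):
  def oauth_user_info(self, provider, resp):
    # Only handle our own provider; return no user info for anything else.
    if provider == MY_PROVIDER:
      log.debug("{0} response received : {1}".format(provider, resp))
      id_token = resp["id_token"]
      log.debug(str(id_token))
      # Reuse Flask-AppBuilder's Azure helper to decode the id_token payload.
      me = self._azure_jwt_token_parse(id_token)
      log.debug("Parse JWT token : {0}".format(me))

      # Split the display name into first and last name when it is present.
      if not me.get("name"):
        firstName = ""
        lastName  = ""
      else:
        firstName = me.get("name").split(' ')[0]
        lastName  = me.get("name").split(' ')[-1]
      return {
        "username":   me.get("email"),
        "email":      me.get("email"),
        "first_name": firstName,
        "last_name":  lastName,
        # Groups from the token become the keys looked up in AUTH_ROLES_MAPPING.
        "role_keys":  me.get("groups", ['Guest'])
      }
    else:
      return {}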

AUTH_TYPE = AUTH_OAUTH
AUTH_USER_REGISTRATION = True
AUTH_USER_REGISTRATION_ROLE = "Guest"
AUTH_ROLES_SYNC_AT_LOGIN = True
CSRF_ENABLED = True

AUTH_ROLES_MAPPING = {
    "Airflow_Users": ["User"],
    "Airflow_Admins": ["Admin"],
}

OAUTH_PROVIDERS = [
  {
    'name': MY_PROVIDER,
    'icon': 'fa-circle-o',
    'token_key': 'access_token',
    'remote_app': {
      'server_metadata_url':  WELL_KNOWN_URL,
      'client_id':            CLIENT_ID,
      'client_secret':        CLIENT_SECRET,
      'client_kwargs': {
        'scope': 'openid groups',
        'token_endpoint_auth_method': 'client_secret_post'
      },
      'access_token_method': 'POST',
    }
  }
]
SECURITY_MANAGER_CLASS = customSecurityManager
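
To make the role handling a bit more concrete, here is a minimal, standalone sketch of how I understand the "role_keys" returned by oauth_user_info to be matched against AUTH_ROLES_MAPPING. It is a simplified illustration, not Flask-AppBuilder's actual code: resolve_roles is a hypothetical helper, and adding the registration role on top of the mapped roles is my assumption about the behaviour when AUTH_USER_REGISTRATION is enabled.

```python
# Simplified illustration only; resolve_roles is a hypothetical helper,
# not part of Flask-AppBuilder or Airflow.
AUTH_ROLES_MAPPING = {
    "Airflow_Users": ["User"],
    "Airflow_Admins": ["Admin"],
}


def resolve_roles(role_keys, mapping, registration_role=None):
    """Map provider group names to Airflow role names."""
    roles = set()
    for key in role_keys:
        # Groups without an entry in the mapping contribute no roles.
        roles.update(mapping.get(key, []))
    if registration_role:
        # Assumption: the registration role is granted in addition.
        roles.add(registration_role)
    return sorted(roles)


admin_roles = resolve_roles(["Airflow_Admins", "Unrelated_Group"], AUTH_ROLES_MAPPING, "Guest")
fallback_roles = resolve_roles(["Guest"], AUTH_ROLES_MAPPING, "Guest")
print(admin_roles)
print(fallback_roles)
```

With this reading, a user whose token carries no matching groups ends up with only the registration role.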

Ironically, the Azure provider already returns an id_token and it's handled, so my code makes use of that existing parsing (the _azure_jwt_token_parse call).

The code decodes the id_token. Note that you can turn on debug logging by setting the environment variable AIRFLOW__LOGGING__FAB_LOGGING_LEVEL to DEBUG.

If you switch on debug logs and see an entry like the following (note the id_token) you can probably use the code I've supplied.

DEBUG - OAUTH Authorized resp: {'access_token': '<redacted>', 'expires_in': 3600, 'id_token': '<redacted>', 'refresh_token': '<redacted>', 'scope': 'openid groups', 'token_type': 'Bearer', 'expires_at': <redacted - unix timestamp>}

The id_token consists of three parts joined by full stops (.). The middle part contains the user data and is simply base64 encoded.
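
If you want to inspect the middle part yourself, something along these lines should work. It is a minimal sketch using only the standard library: decode_jwt_payload is a hypothetical helper name, the sample token is built in the snippet purely for demonstration, and the signature is not verified, so treat it as a debugging aid rather than a way to trust the token.

```python
import base64
import json


def decode_jwt_payload(token):
    """Return the claims from the middle part of a JWT, without verifying it."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated parts")
    payload = parts[1]
    # JWTs use URL-safe base64 with the padding stripped; restore it first.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(data):
    """Encode a dict the way a JWT part is encoded (demo helper only)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Made-up token for demonstration; the signature part is a dummy string.
sample_token = ".".join([
    _b64url({"alg": "none"}),
    _b64url({"email": "user@example.com", "groups": ["Airflow_Users"]}),
    "signature",
])

claims = decode_jwt_payload(sample_token)
print(claims["email"], claims["groups"])
```

Pasting the id_token value from the debug log into decode_jwt_payload should show whether the email, name and groups claims the security manager relies on are actually present.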

Upvotes: 1
