Reputation: 79
I'm trying to deploy an HA Keycloak cluster (2 nodes) on Kubernetes (GKE). So far, judging from the logs, the cluster nodes (pods) fail to discover each other in every case: the pods start and the service is up, but each pod fails to see the other nodes.
Components
Logs Snippet:
INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-4) ISPN000078: Starting JGroups channel ejb
INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-4) ISPN000094: Received new cluster view for channel ejb: [keycloak-567575d6f8-c5s42|0] (1) [keycloak-567575d6f8-c5s42]
INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000094: Received new cluster view for channel ejb: [keycloak-567575d6f8-c5s42|0] (1) [keycloak-567575d6f8-c5s42]
INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-3) ISPN000094: Received new cluster view for channel ejb: [keycloak-567575d6f8-c5s42|0] (1) [keycloak-567575d6f8-c5s42]
INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-4) ISPN000079: Channel ejb local address is keycloak-567575d6f8-c5s42, physical addresses are [127.0.0.1:55200]
.
.
.
INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: Keycloak 15.0.2 (WildFly Core 15.0.1.Final) started in 67547ms - Started 692 of 978 services (686 services are lazy, passive or on-demand)
INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://127.0.0.1:9990/management
INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://127.0.0.1:9990
As we can see in the logs above, the node sees only itself (its own container/pod ID) in the cluster view.
I tried using the kubernetes.KUBE_PING protocol for discovery, but it didn't work: the call to the Kubernetes API (listing the pods) failed with a 403 Authorization error in the logs (part of it shown below):
Server returned HTTP response code: 403 for URL: https://[SERVER_IP]:443/api/v1/namespaces/default/pods
At this point I was able to log in to the portal and make changes, but it was not yet an HA cluster: changes were not replicated and the session was not preserved. In other words, if I deleted the pod I was using, I was redirected to the other pod with a new session (as if it were a separate node).
When I tried DNS_PING, things were different: I had no Kubernetes API authorization issues, but I was not able to log in.
In detail, I could reach the login page normally, but when I entered my credentials and tried to log in, the page started loading and then sent me back to the login page, with nothing relevant about it in the pod logs.
Below are the manifests I have been working with over the past couple of days:
Postgresql Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:13
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_PASSWORD
              value: "postgres"
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  selector:
    app: postgres
  ports:
    - port: 5432
      targetPort: 5432
Keycloak HA cluster Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keycloak
  labels:
    app: keycloak
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      app: keycloak
  template:
    metadata:
      labels:
        app: keycloak
    spec:
      containers:
        - name: keycloak
          image: jboss/keycloak
          env:
            - name: KEYCLOAK_USER
              value: admin
            - name: KEYCLOAK_PASSWORD
              value: admin123
            - name: DB_VENDOR
              value: POSTGRES
            - name: DB_ADDR
              value: "postgres"
            - name: DB_PORT
              value: "5432"
            - name: DB_USER
              value: "postgres"
            - name: DB_PASSWORD
              value: "postgres"
            - name: DB_SCHEMA
              value: "public"
            - name: DB_DATABASE
              value: "keycloak"
            # - name: JGROUPS_DISCOVERY_PROTOCOL
            #   value: kubernetes.KUBE_PING
            # - name: JGROUPS_DISCOVERY_PROPERTIES
            #   value: dump_requests=true,port_range=0,namespace=default
            #   value: port_range=0,dump_requests=true
            - name: JGROUPS_DISCOVERY_PROTOCOL
              value: dns.DNS_PING
            - name: JGROUPS_DISCOVERY_PROPERTIES
              value: "dns_query=keycloak"
            - name: CACHE_OWNERS_COUNT
              value: '2'
            - name: CACHE_OWNERS_AUTH_SESSIONS_COUNT
              value: '2'
            - name: PROXY_ADDRESS_FORWARDING
              value: "true"
          ports:
            - name: http
              containerPort: 8080
            - name: https
              containerPort: 8443
---
apiVersion: v1
kind: Service
metadata:
  name: keycloak
  labels:
    app: keycloak
spec:
  type: ClusterIP
  ports:
    - name: http
      port: 80
      targetPort: 8080
    - name: https
      port: 443
      targetPort: 8443
  selector:
    app: keycloak
---
apiVersion: v1
kind: Service
metadata:
  name: keycloak-np
  labels:
    app: keycloak
spec:
  type: LoadBalancer
  ports:
    - name: http
      port: 80
      targetPort: 8080
    - name: https
      port: 443
      targetPort: 8443
  selector:
    app: keycloak
Upvotes: 3
Views: 8898
Reputation: 1815
By default, the newer Quarkus-based versions of Keycloak (17+) use DNS_PING as the discovery mechanism for JGroups (the underlying clustering library), but you still need to activate it.
You'll need:
- a headless service (clusterIP: None) selecting the Keycloak pods,
- KC_CACHE_STACK=kubernetes (to activate the kubernetes JGroups stack), and
- JAVA_OPTS_APPEND=-Djgroups.dns.query=<name-of-headless-service> (to tell it how to find the other Keycloak pods).
That way, when starting up, JGroups issues a DNS query (for example, keycloak-headless.my_namespace.svc.cluster.local) and the response is the IP of every pod associated with the headless service. JGroups then contacts each IP on the communication port and establishes the cluster.
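A rough, untested sketch of those pieces; the headless service name keycloak-headless, the image tag, and the JGroups port are assumptions, and the database/hostname settings a Quarkus Keycloak also needs are omitted:
# Headless service: clusterIP: None makes the DNS query return the pod IPs
apiVersion: v1
kind: Service
metadata:
  name: keycloak-headless
spec:
  clusterIP: None
  selector:
    app: keycloak
  ports:
    - name: jgroups
      port: 7800
      targetPort: 7800
---
# Relevant fragment of the Keycloak (17+) Deployment pod template
spec:
  template:
    spec:
      containers:
        - name: keycloak
          image: quay.io/keycloak/keycloak:19.0.3   # assumed image/tag
          env:
            - name: KC_CACHE_STACK
              value: kubernetes
            - name: JAVA_OPTS_APPEND
              value: "-Djgroups.dns.query=keycloak-headless.default.svc.cluster.local"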
UPDATE 2022-08-01: The configuration below is for the legacy (WildFly-based) version of Keycloak, i.e. versions up to 16. From 17 on, Keycloak moved to the Quarkus distribution and the configuration is different, as described above.
The way KUBE_PING works is similar to running kubectl get pods inside one Keycloak pod to find the other Keycloak pods' IPs, and then trying to connect to them one by one. The difference is that Keycloak does this by querying the Kubernetes API directly instead of using kubectl.
To access the Kubernetes API, Keycloak needs credentials in the form of an access token. You can pass your token directly, but this is not very secure or convenient.
Kubernetes has a built-in mechanism for injecting a token into a pod (or the software running inside that pod) to allow it to query the API. This is done by creating a service account, giving it the necessary permissions through a RoleBinding, and setting that account in the pod configuration.
The token is then mounted as a file at a known location, which is hardcoded and expected by all Kubernetes clients. When the client wants to call the API, it looks for the token at that location.
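For illustration, that well-known location is the same in every pod; a quick way to confirm the token is there (the pod name is just an example taken from the logs above):
kubectl exec keycloak-567575d6f8-c5s42 -- \
  cat /var/run/secrets/kubernetes.io/serviceaccount/token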
You can get a deeper look at the Service Account mechanism in the documentation.
In some situations, you may not have the necessary permissions to create RoleBindings. In this case, you can ask an administrator to create the service account and RoleBinding for you or pass your own user's token (if you have the necessary permissions) through the SA_TOKEN_FILE environment variable.
You can create the file using a Secret or ConfigMap, mount it into the pod, and set SA_TOKEN_FILE to the file location. Note that this mechanism is specific to the JGroups library (used by Keycloak); see its KUBE_PING documentation for the details.
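A rough sketch of that variant; the Secret name, the mount path, and the token value are placeholders you would replace with your own:
# Hypothetical Secret holding a token that is allowed to list pods
apiVersion: v1
kind: Secret
metadata:
  name: keycloak-api-token            # assumed name
stringData:
  token: "<paste-a-valid-token-here>" # deliberately left out
---
# Relevant fragment of the Keycloak Deployment pod template
spec:
  template:
    spec:
      containers:
        - name: keycloak
          env:
            - name: SA_TOKEN_FILE
              value: /etc/keycloak-token/token   # assumed mount path
          volumeMounts:
            - name: api-token
              mountPath: /etc/keycloak-token
              readOnly: true
      volumes:
        - name: api-token
          secret:
            secretName: keycloak-api-token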
If you do have permissions to create service accounts and RoleBindings in the cluster:
An example (not tested):
export TARGET_NAMESPACE=default
# convenient method to create a service account
kubectl create serviceaccount keycloak-kubeping-service-account -n $TARGET_NAMESPACE
# No convenient method to create Role and RoleBindings
# Needed to explicitly define them.
cat <<EOF | kubectl apply -f -
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: keycloak-kubeping-pod-reader
  namespace: $TARGET_NAMESPACE
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: keycloak-kubeping-api-access
  namespace: $TARGET_NAMESPACE
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: keycloak-kubeping-pod-reader
subjects:
  - kind: ServiceAccount
    name: keycloak-kubeping-service-account
    namespace: $TARGET_NAMESPACE
EOF
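A quick sanity check that the binding gives KUBE_PING the access it needs (assuming the names used above); it should print yes:
kubectl auth can-i list pods -n $TARGET_NAMESPACE \
  --as=system:serviceaccount:$TARGET_NAMESPACE:keycloak-kubeping-service-account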
On the deployment, you set the serviceAccount:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keycloak
spec:
  template:
    spec:
      serviceAccount: keycloak-kubeping-service-account
      serviceAccountName: keycloak-kubeping-service-account
      containers:
        - name: keycloak
          image: jboss/keycloak
          env:
            # ...
            - name: JGROUPS_DISCOVERY_PROTOCOL
              value: kubernetes.KUBE_PING
            - name: JGROUPS_DISCOVERY_PROPERTIES
              value: dump_requests=true
            - name: KUBERNETES_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
            # ...
dump_requests=true will help you debug the Kubernetes API requests; it's better to set it to false in production. You can use namespace=<your-namespace> in JGROUPS_DISCOVERY_PROPERTIES instead of the KUBERNETES_NAMESPACE environment variable, but the fieldRef above is a handy way for the pod to autodetect the namespace it's running in.
Please note that KUBE_PING will find all pods in the namespace, not only keycloak pods, and will try to connect to all of them. Of course, if your other pods don't care about that, it's OK.
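If probing unrelated pods is a concern, the jgroups-kubernetes discovery can be narrowed with a label selector; a small sketch, assuming your Keycloak pods carry the app=keycloak label:
          env:
            # only pods matching this label are contacted (assumed label)
            - name: KUBERNETES_LABELS
              value: app=keycloak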
Upvotes: 11
Reputation: 164
After a long time with it, the best option is JDBC_PING, which also fits a Kubernetes environment. This approach works with Keycloak and with a separate Infinispan cluster as well.
A basic approach can be found here: https://github.com/thomasdarimont/keycloak-project-example/blob/main/deployments/local/cluster/haproxy-database-ispn/cli/0300-onstart-setup-ispn-jdbc-store.cli
What I suggest is generating a CLI script that runs on startup, or configuring the discovery protocol through environment variables (a sketch follows below). You'll need a database to persist the data, and the nodes register themselves there. It works in all environments.
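For the legacy jboss/keycloak image used in the question, a minimal sketch of the environment-variable route (untested; it assumes the image's default KeycloakDS datasource, so JDBC_PING reuses the database Keycloak already connects to):
          env:
            # added next to the existing DB_* variables in the container env
            - name: JGROUPS_DISCOVERY_PROTOCOL
              value: JDBC_PING
            - name: JGROUPS_DISCOVERY_PROPERTIES
              # assumed default datasource JNDI name of the legacy image
              value: datasource_jndi_name=java:jboss/datasources/KeycloakDS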
Feel free to use the repo I put together, which contains the whole setup for a clustered environment backed by MySQL: https://github.com/albertoSoto/keycloak-infinispan-cluster
Upvotes: 1