Manu Chadha

Reputation: 16729

How do I access my Cassandra/Kubernetes cluster from outside the cluster?

I have started using Cass-Operator and the setup worked like a charm! https://github.com/datastax/cass-operator.

I have an issue though. My cluster is up and running on GCP, but how do I access it from my laptop (i.e. from outside the cluster)? I'm new to Kubernetes, so I don't know where to start.

I can see the nodes are up on the GCP dashboard. I can ping the external IP of the nodes from my laptop, but when I run cqlsh external_ip 9042 the connection fails.

How do I go about connecting the K8s/Cassandra cluster to the outside world so that my web application can access it?

I would like to:

  1. have a URL so that my web application uses that URL to connect to the Cassandra/K8s cluster instead of an IP address. Thus, I need a DNS name. Does one come by default in K8s? What would the URL be? Would K8s manage the DNS mapping for me if some nodes get restarted?
  2. my web application should be able to reach Cassandra on port 9042. Load balancing seems to be discussed mostly for HTTP/HTTPS, but Cassandra traffic is not HTTP/HTTPS, so I don't need port 80 or 443.

I have read a few tutorials that talk about Services, LoadBalancers and Ingress, but I am unable to make a start.

I created a service like this:

kind: Service
apiVersion: v1
metadata:
  name: cass-operator-service
spec:
  type: LoadBalancer
  ports:
    - port: 9042
  selector:
    name: cass-operator

Then I created the service: kubectl apply -f ./cass-operator-service.yaml

I checked that the service was created using kubectl get svc and got this output:

NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)          AGE 
cass-operator-service   LoadBalancer   10.51.249.224   34.91.214.233   9042:30136/TCP   4m17s 
kubernetes              ClusterIP      10.51.240.1     <none>          443/TCP          10h

But when I run cqlsh 34.91.214.233 9042, the connection fails.

It seems that requests to port 9042 are forwarded to 30136, but they should be forwarded to 9042, as that is where the Cassandra image in the pods is listening for incoming requests.

UPDATE

I tried targetPort, but still no luck.

manuchadha25@cloudshell:~ (copper-frame-262317)$ cat cass-operator-service.yaml
kind: Service
apiVersion: v1
metadata:
  name: cass-operator-service
spec:
  type: LoadBalancer
  ports:
    - port: 9042
      targetPort: 9042
  selector:
    name: cass-operator
manuchadha25@cloudshell:~ (copper-frame-262317)$ kubectl get service
NAME         TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.51.240.1   <none>        443/TCP   11h
manuchadha25@cloudshell:~ (copper-frame-262317)$ kubectl apply -f ./cass-operator-service.yaml
service/cass-operator-service created
manuchadha25@cloudshell:~ (copper-frame-262317)$ kubectl get service
NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
cass-operator-service   LoadBalancer   10.51.255.184   <pending>     9042:30024/TCP   12s
kubernetes              ClusterIP      10.51.240.1     <none>        443/TCP          11h
manuchadha25@cloudshell:~ (copper-frame-262317)$ kubectl get service
NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
cass-operator-service   LoadBalancer   10.51.255.184   <pending>     9042:30024/TCP   37s
kubernetes              ClusterIP      10.51.240.1     <none>        443/TCP          11h
manuchadha25@cloudshell:~ (copper-frame-262317)$ kubectl get service
NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)          AGE
cass-operator-service   LoadBalancer   10.51.255.184   34.91.214.233   9042:30024/TCP   67s
kubernetes              ClusterIP      10.51.240.1     <none>          443/TCP          11h
manuchadha25@cloudshell:~ (copper-frame-262317)$ ping 34.91.214.233
PING 34.91.214.233 (34.91.214.233) 56(84) bytes of data.
64 bytes from 34.91.214.233: icmp_seq=1 ttl=109 time=7.89 ms


Querying all namespaces reveals the following (screenshot omitted).

But querying pods with the label name=cass-operator returns an empty result:

manuchadha25@cloudshell:~ (copper-frame-262317)$ kubectl get pods -l name=cass-operator
No resources found in default namespace.

Upvotes: 1

Views: 2958

Answers (2)

Will R.O.F.

Reputation: 4108

  • Since you are new to Kubernetes, you probably are not familiar with StatefulSets:

StatefulSet is the workload API object used to manage stateful applications.

Manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods.

Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky identity for each of their Pods. These pods are created from the same spec, but are not interchangeable: each has a persistent identifier that it maintains across any rescheduling.


"How do I go about connecting the K8s/Cassandra cluster to the outside world so that my web application can access it?"

  • I found out that datastax/cass-operator is still developing its documentation. I found this document, which is not merged to master yet, but it explains very well how to connect to Cassandra; I strongly recommend reading it.
  • There are several open issues for documenting methods for connection from outside the cluster.

I followed the guide in https://github.com/datastax/cass-operator to deploy the cass-operator + Cassandra Datacenter example, which, from your images, I believe you followed as well:

$ kubectl create -f https://raw.githubusercontent.com/datastax/cass-operator/v1.2.0/docs/user/cass-operator-manifests-v1.15.yaml
namespace/cass-operator created
serviceaccount/cass-operator created
secret/cass-operator-webhook-config created
customresourcedefinition.apiextensions.k8s.io/cassandradatacenters.cassandra.datastax.com created
clusterrole.rbac.authorization.k8s.io/cass-operator-cluster-role created
clusterrolebinding.rbac.authorization.k8s.io/cass-operator created
role.rbac.authorization.k8s.io/cass-operator created
rolebinding.rbac.authorization.k8s.io/cass-operator created
service/cassandradatacenter-webhook-service created
deployment.apps/cass-operator created
validatingwebhookconfiguration.admissionregistration.k8s.io/cassandradatacenter-webhook-registration created

$ kubectl create -f https://raw.githubusercontent.com/datastax/cass-operator/v1.2.0/operator/k8s-flavors/gke/storage.yaml
storageclass.storage.k8s.io/server-storage created

$ kubectl -n cass-operator create -f https://raw.githubusercontent.com/datastax/cass-operator/v1.2.0/operator/example-cassdc-yaml/cassandra-3.11.6/example-cassdc-minimal.yaml
cassandradatacenter.cassandra.datastax.com/dc1 created

$ kubectl get all -n cass-operator
NAME                                READY   STATUS    RESTARTS   AGE
pod/cass-operator-78c6469c6-6qhsb   1/1     Running   0          139m
pod/cluster1-dc1-default-sts-0      2/2     Running   0          138m
pod/cluster1-dc1-default-sts-1      2/2     Running   0          138m
pod/cluster1-dc1-default-sts-2      2/2     Running   0          138m

NAME                                          TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)             AGE
service/cass-operator-metrics                 ClusterIP      10.21.5.65    <none>         8383/TCP,8686/TCP   138m
service/cassandradatacenter-webhook-service   ClusterIP      10.21.0.89    <none>         443/TCP             139m
service/cluster1-dc1-all-pods-service         ClusterIP      None          <none>         <none>              138m
service/cluster1-dc1-service                  ClusterIP      None          <none>         9042/TCP,8080/TCP   138m
service/cluster1-seed-service                 ClusterIP      None          <none>         <none>              138m

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cass-operator   1/1     1            1           139m

NAME                                      DESIRED   CURRENT   READY   AGE
replicaset.apps/cass-operator-78c6469c6   1         1         1       139m

NAME                                        READY   AGE
statefulset.apps/cluster1-dc1-default-sts   3/3     138m

$ CASS_USER=$(kubectl -n cass-operator get secret cluster1-superuser -o json | jq -r '.data.username' | base64 --decode)
$ CASS_PASS=$(kubectl -n cass-operator get secret cluster1-superuser -o json | jq -r '.data.password' | base64 --decode)

$ echo $CASS_USER
cluster1-superuser

$ echo $CASS_PASS
_5ROwp851l0E_2CGuN_n753E-zvEmo5oy31i6C0DBcyIwH5vFjB8_g
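If jq is not available, the same decoding can be done in a few lines of Python on the parsed output of kubectl get secret -o json. This is just a sketch; decode_secret is a hypothetical helper, and the key names follow the cluster1-superuser example above:

```python
import base64

def decode_secret(secret_json):
    """Decode every base64-encoded field in a Kubernetes Secret's .data map."""
    return {key: base64.b64decode(value).decode()
            for key, value in secret_json["data"].items()}
```

Feed it json.loads(...) of the kubectl output and it returns a plain dict such as {'username': 'cluster1-superuser', 'password': '...'}.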
  • From the kubectl get all command above we can see there is a StatefulSet called statefulset.apps/cluster1-dc1-default-sts, which controls the Cassandra pods.
  • In order to create a LoadBalancer service that exposes all the pods managed by this StatefulSet, we need to use the same labels assigned to them:
$ kubectl describe statefulset cluster1-dc1-default-sts -n cass-operator
Name:               cluster1-dc1-default-sts
Namespace:          cass-operator
CreationTimestamp:  Tue, 30 Jun 2020 12:24:34 +0200
Selector:           cassandra.datastax.com/cluster=cluster1,cassandra.datastax.com/datacenter=dc1,cassandra.datastax.com/rack=default
Labels:             app.kubernetes.io/managed-by=cass-operator
                    cassandra.datastax.com/cluster=cluster1
                    cassandra.datastax.com/datacenter=dc1
                    cassandra.datastax.com/rack=default
  • Now let's create the LoadBalancer service yaml and use the labels as selectors for the service:
apiVersion: v1
kind: Service
metadata:
  name: cassandra-loadbalancer
  namespace: cass-operator
  labels:
    cassandra.datastax.com/cluster: cluster1
    cassandra.datastax.com/datacenter: dc1
    cassandra.datastax.com/rack: default
spec:
  type: LoadBalancer
  ports:
  - port: 9042
    protocol: TCP
  selector:
    cassandra.datastax.com/cluster: cluster1
    cassandra.datastax.com/datacenter: dc1
    cassandra.datastax.com/rack: default

"My web application should be able to reach Cassandra on 9042. It seems load balancing is done for http/https. The Cassandra application is not a http/https request. So I don't need port 80 or 443."

  • When you create a Service of type LoadBalancer, a Google Cloud controller wakes up and configures a network load balancer in your project. The load balancer has a stable IP address that is accessible from outside of your project.

  • The network load balancer supports any and all ports. You can use Network Load Balancing to load balance TCP and UDP traffic. Because the load balancer is a pass-through load balancer, your backends terminate the load-balanced TCP connection or UDP packets themselves.

  • Now let's apply the yaml and note the Endpoint IPs of the pods being listed:

$ kubectl apply -f cassandra-loadbalancer.yaml 
service/cassandra-loadbalancer created

$ kubectl get service cassandra-loadbalancer -n cass-operator 
NAME                     TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)          AGE
cassandra-loadbalancer   LoadBalancer   10.21.4.253   146.148.89.7   9042:30786/TCP   5m13s

$ kubectl describe svc cassandra-loadbalancer -n cass-operator
Name:                     cassandra-loadbalancer
Namespace:                cass-operator
Labels:                   cassandra.datastax.com/cluster=cluster1
                          cassandra.datastax.com/datacenter=dc1
                          cassandra.datastax.com/rack=default
Annotations:              Selector:  cassandra.datastax.com/cluster=cluster1,cassandra.datastax.com/datacenter=dc1,cassandra.datastax.com/rack=default
Type:                     LoadBalancer
IP:                       10.21.4.253
LoadBalancer Ingress:     146.148.89.7
Port:                     <unset>  9042/TCP
TargetPort:               9042/TCP
NodePort:                 <unset>  30786/TCP
Endpoints:                10.24.0.7:9042,10.24.2.7:9042,10.24.3.9:9042
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
  • To test it, I'll use my cloud shell with a Cassandra container to emulate your laptop, using the LoadBalancer IP provided above:
$ docker run -it cassandra /bin/sh

# cqlsh -u cluster1-superuser -p _5ROwp851l0E_2CGuN_n753E-zvEmo5oy31i6C0DBcyIwH5vFjB8_g 146.148.89.7 9042                

Connected to cluster1 at 146.148.89.7:9042.
[cqlsh 5.0.1 | Cassandra 3.11.6 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cluster1-superuser@cqlsh> select * from system.peers;

 peer      | data_center | host_id                              | preferred_ip | rack    | release_version | rpc_address | schema_version                       | tokens
-----------+-------------+--------------------------------------+--------------+---------+-----------------+-------------+--------------------------------------+--------------------------
 10.24.3.9 |         dc1 | bcec6c12-49a1-41d5-be58-5150e99f5dfb |         null | default |          3.11.6 |   10.24.3.9 | e84b6a60-24cf-30ca-9b58-452d92911703 |  {'2248175870989649036'}
 10.24.0.7 |         dc1 | 68409f08-9d6e-4e40-91ff-f43581c8b6f3 |         null | default |          3.11.6 |   10.24.0.7 | e84b6a60-24cf-30ca-9b58-452d92911703 | {'-1105923522927946373'}

(2 rows)
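If cqlsh isn't at hand, the pass-through TCP load balancer can also be smoke-tested with a plain socket connect, since all it does is forward TCP to the pods. A minimal Python sketch (cassandra_reachable is a hypothetical helper; substitute your LoadBalancer IP for the example value):

```python
import socket

def cassandra_reachable(host, port=9042, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. cassandra_reachable("146.148.89.7")
```

This only proves the TCP path is open; authentication and CQL still happen at the driver level.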

"have a url so that my web application uses that URL to connect to the cassandra/K8s cluster instead of IP address. So I need a dns. Does it come by default in K8S? Would would be the url? Would K8s managing the dns mapping for me in some nodes get restarted?"

  • That documentation on cass-operator also has a section about Ingress, which I recommend reading as well.
  • Kubernetes does not come with a public DNS name by default.
  • You will have to register a domain and point its DNS record to the IP of the load balancer; the name will then resolve to the IP of the network load balancer.
  • The network load balancer is bound to a static public IP, so changes to the Kubernetes nodes will not cause service unavailability.

If you have any question, let me know in the comments.

Upvotes: 6

Kamol Hasan

Reputation: 13456

To output the stable external IP address under loadBalancer:ingress, use the following command:

$ kubectl get service cass-operator-service -o yaml
... ...
... ...
status:
  loadBalancer:
    ingress:
    - ip: 203.0.113.10

Now you should be able to access Cassandra at <load-balancer-ingress-ip>:9042.

N.B.: Sometimes it takes a few minutes for GKE to configure the load balancer.
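The same field can also be pulled out programmatically from the parsed JSON, which is handy for scripting the wait. A sketch (ingress_ip is a hypothetical helper; it returns None while GKE is still provisioning the load balancer):

```python
def ingress_ip(service):
    """Return the external IP from a Service's status.loadBalancer.ingress, or None if pending."""
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    return ingress[0].get("ip") if ingress else None
```

Pass it json.loads(...) of kubectl get service cass-operator-service -o json.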

Update:

Add the target port and the correct label selector to your service YAML, like below:

kind: Service
apiVersion: v1
metadata:
  name: cass-operator-service
spec:
  type: LoadBalancer
  ports:
    - port: 9042
      targetPort: 9042
  selector:
    # add labels which are specified in the Cassandra pods
    # not the operator.

Apply changes:

$ kubectl apply -f service.yaml
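For reference, with the cass-operator example datacenter (cluster cluster1, datacenter dc1, rack default, as shown in the other answer), a filled-in version of that selector could look like the following; adjust the labels to whatever your Cassandra pods actually carry (check with kubectl get pods --show-labels -n cass-operator):

```yaml
kind: Service
apiVersion: v1
metadata:
  name: cass-operator-service
  namespace: cass-operator
spec:
  type: LoadBalancer
  ports:
    - port: 9042
      targetPort: 9042
  selector:
    cassandra.datastax.com/cluster: cluster1
    cassandra.datastax.com/datacenter: dc1
    cassandra.datastax.com/rack: default
```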

Upvotes: 0
