Reputation: 16755
My question is built on the question and answers from this question - What's the difference between ClusterIP, NodePort and LoadBalancer service types in Kubernetes?
The question might not be well-formed for some of you.
I am trying to understand the differences between clusterIP, nodePort and Loadbalancer and when to use these with an example. I suppose that my understanding of the following concept is correct.
K8s consists of the following components:
- Node - A VM or physical machine. Runs kubectl and docker process
- Pod - unit which encapsulates container(s) and volumes (storage). If a pod contains multiple containers then shared volume could be the way for process communication
- Node can have one or multiple pods. Each pod will have its own IP
- Cluster - replicas of a Node. Each node in a cluster will contain same pods (instances, type)
Here is the scenario:
My application has a web server (always returning 200OK) and a database (always returning the same value) for simplicity. Also, say I am on GCP and I make images of webserver and of the database. Each of these will be run in their own respective pods and will have 2 replicas. I suppose I'll have two clusters (cluster-webserver (node1-web (containing pod1-web), node2-web (containing pod2-web)) and cluster-database (node1-db (containing pod1-db), node2-db (containing pod2-db)). Each node will have its own ip address (node1-webip, node2-webip, node1-dbip, node2-dbip).
A client application (browser) should be able to access the web application from outside the web cluster but the database shouldn't be accessible from outside the database cluster. However web nodes should be able to access database nodes.
Question 1 - Am I correct that if I create a service for web (webServiceName) and a service for database then by default, I'll get only clusterIP and a port (or targetPort)?
Question 1.2 - Am I correct that clusterIP is an IP assigned to a pod, not the node i.e. in my example, clusterIP gets assigned to pod1-web, not node1-web even though node1 has only pod1?
Question 1.3 - Am I correct that as cluster IP is accessible from only within the cluster, pod1-web and pod2-web can talk to each other and pod1-db and pod2-db can talk to each other using clusterIP/dns:port or clusterIP/dns:targetPort but web can't talk to database (and vice versa) and external client can't talk to web? Also, the nodes are not accessible using the cluster IP.
Question 1.4 - Am I correct that dns i.e. servicename.namespace.svc.cluster.local would map the clusterIP?
Question 1.5 - For which type of applications I might use only clusterIP? Where multiple instances of an application need to communicate with each other (eg master-slave configuration)?
If I use nodePort then K8s will open a port on each of the nodes and will forward nodeIP/nodePort to cluster IP (on pod)/Cluster Port.
Question 2 - Can web nodes now access database nodes using nodeIP:nodePort which will route the traffic to database's clusterIP (on pod):clusterport/targetPort? (I have read that clusterIP/dns:nodePort will not work).
Question 2.1 - How do I get a node's IP? Is nodeIP the IP I'll get when I run describe pods command?
Question 2.2 - Is there a dns equivalent for the node IP as node IP could change during failovers. Or does dns now resolve to the node's IP instead of clusterIP?
Question 2.3 - I read that K8s will create endpoints for each service. Is endpoint same as node or is it same as pod? If I run kubectl describe pods or kubectl get endpoints, would I get same IPs?
As I am on GCP, I can use Loadbalancer for the web cluster to get an external IP. Using the external IP, the client application can access the web service. I saw this configuration for a LoadBalancer:
spec:
  selector:
    app: MyApp
  ports:
  - protocol: TCP
    port: 80
    targetPort: 9376
  type: LoadBalancer
Question 3 - Is it exposing an external IP and port 80 to the outside world? What would be the value of nodePort in this case?
Upvotes: 1
Views: 3703
Reputation: 11158
My question is built on the question and answers from this question - What's the difference between ClusterIP, NodePort and LoadBalancer service types in Kubernetes?
The question might not be well-formed for some of you.
It's ok but in my opinion it's a bit too extensive for a single question and it could be posted as a few separate questions as it touches quite a few different topics.
I am trying to understand the differences between clusterIP, nodePort and Loadbalancer and when to use these with an example. I suppose that my understanding of the following concept is correct. K8s consists of the following components
- Node - A VM or physical machine. Runs kubectl and docker process
Not kubectl but kubelet. You can check it by ssh-ing into your node and running systemctl status kubelet. And yes, it also runs some sort of container runtime environment. It doesn't have to be exactly docker.
- Pod - unit which encapsulates container(s) and volumes (storage). If a pod contains multiple containers then shared volume could be the way for process communication
- Node can have one or multiple pods. Each pod will have its own IP
That's correct.
- Cluster - replicas of a Node. Each node in a cluster will contain same pods (instances, type)
Not really. Kubernetes nodes are not different replicas. They are part of the same kubernetes cluster but they are independent instances, which are capable of running your containerized apps. In kubernetes terminology this is called a workload. A workload isn't part of the kubernetes cluster, it's something that you run on it. Your Pods can be scheduled on different nodes and it doesn't always have to be an even distribution. Suppose you have a kubernetes cluster consisting of 3 worker nodes (nodes on which workload can be scheduled, as opposed to the master node, which usually runs only kubernetes control plane components). If you deploy your application as a Deployment with e.g. 5 replicas, 5 different replicas of the same Pod are created. Usually they are scheduled on different nodes, but a situation where node1 runs 2 replicas, node2 runs 3 replicas and node3 runs zero replicas is perfectly possible.
You need to keep in mind that there are different clustering levels. You have your kubernetes cluster which basically is an environment to run your containerized workload.
There are also clusters within this cluster i.e. it is perfectly possible that your workload forms clusters as well e.g. you can have a database deployed as a StatefulSet and it can run in a cluster. In such a scenario, different stateful Pods will form members or nodes of such a cluster.
Even if your Pods don't communicate with each other but e.g. serve exactly the same content, the Deployment resource makes sure that a certain number of replicas of such a Pod is always up and running. If one kubernetes node for some reason becomes unavailable, such a Pod needs to be re-scheduled on one of the available nodes. So the replication of your workload isn't achieved by deploying it on different kubernetes nodes but by assuring that a certain amount of replicas of a Pod of a certain kind is always up and running, and it may be running on the same as well as on different kubernetes nodes.
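The replica behaviour described above can be sketched with a minimal Deployment manifest (the names and image here are hypothetical, just for illustration):

```yaml
# Minimal Deployment sketch: the Deployment controller keeps 5 replicas
# of the same Pod template up and running; the scheduler may place them
# on any of the worker nodes, not necessarily evenly.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment       # hypothetical name
spec:
  replicas: 5
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25    # any containerized app image
        ports:
        - containerPort: 80
```

If a node goes down, the controller notices the missing replicas and schedules replacements on the remaining nodes.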
Here is the scenario:
My application has a web server (always returning 200OK) and a database (always returning the same value) for simplicity. Also, say I am on GCP and I make images of webserver and of the database. Each of these will be run in their own respective pods and will have 2 replicas. I suppose I'll have two clusters (cluster-webserver (node1-web (containing pod1-web), node2-web (containing pod2-web)) and cluster-database (node1-db (containing pod1-db), node2-db (containing pod2-db)). Each node will have its own ip address (node1-webip, node2-webip, node1-dbip, node2-dbip).
See above what I wrote about different clustering levels. Clusters formed by your app have nothing to do with the kubernetes cluster nor its nodes. And I would say you would rather have 2 different microservices communicating with each other and in some way also dependent on one another. But yes, you may see your database as a separate db cluster deployed within the kubernetes cluster.
A client application (browser) should be able to access the web application from outside the web cluster but the database shouldn't be accessible from outside the database cluster. However web nodes should be able to access database nodes.
- Question 1 - Am I correct that if I create a service for web (webServiceName) and a service for database then by default, I'll get only clusterIP and a port (or targetPort).
Yes, the ClusterIP service type is often simply called a Service because it's the default Service type. If you don't specify type like in this example, a ClusterIP type is created. To understand the difference between port and targetPort you can take a look at this answer or the kubernetes official docs.
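To make the port vs targetPort distinction concrete, here is a minimal ClusterIP Service sketch (the names and port values are hypothetical; note that Kubernetes resource names must be lowercase, so the question's webServiceName becomes web-service):

```yaml
# ClusterIP is the default Service type: omitting "type" yields the same result.
apiVersion: v1
kind: Service
metadata:
  name: web-service     # lowercase form of the question's webServiceName
spec:
  type: ClusterIP
  selector:
    app: web            # matches the labels on the web Pods
  ports:
  - port: 80            # port the Service itself listens on (clusterIP:80)
    targetPort: 8080    # port the container actually serves on
```

Traffic sent to the Service's cluster IP on port 80 is forwarded to port 8080 on one of the matching Pods.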
- Question 1.2 - Am I correct that clusterIP is an IP assigned to a pod, not the node i.e. in my example, clusterIP gets assigned to pod1-web, not node1-web even though node1 has only pod1.
Basically yes. ClusterIP is one of the things that can be easily misunderstood, as it is also used to denote a specific Service type, but in this context yes, it's an internal IP assigned within a kubernetes cluster to a specific resource, in this case to a Pod; a Service object also has its own Cluster IP assigned. Pods, as part of a kubernetes cluster, get their own internal IPs (from the kubernetes cluster perspective) - cluster IPs. Nodes can have a completely different addressing scheme. They can also be private IPs but they are not cluster IPs, in other words they are not internal kubernetes cluster IPs from the cluster perspective. Apart from those external IPs (from the kubernetes cluster perspective), kubernetes nodes, as legitimate API resources / objects, also have their own Cluster IPs assigned.
You can check it by running:
kubectl get nodes --output wide
It will show you both internal and external nodes IPs.
- Question 1.3 - Am I correct that as cluster IP is accessible from only within the cluster, pod1-web and pod2-web can talk to each other and pod1-db and pod2-db can talk to each other using clusterIP/dns:port or clusterIP/dns:targetPort but web can't talk to database (and vice versa) and external client can't talk to web? Also, the nodes are not accessible using the cluster IP.
Yes, cluster IPs are only accessible from within the cluster. And yes, web pods and db pods can communicate with each other (typically the communication is initiated from the web pods) provided you exposed them (in your case the db pods) via a ClusterIP Service. As already mentioned, this type of Service exposes some set of Pods forming one microservice to some other set of Pods which need to communicate with them, and it exposes them only internally, within the cluster, so no external client has access to them. You expose your Pods externally by using LoadBalancer, NodePort or in many scenarios via ingress (which under the hood also uses a loadbalancer).
this fragment is not very clear to me:
but web can't talk to database (and vice versa) and external client can't talk to web? Also, the nodes are not accessible using the cluster IP.
If you expose your db via a Service to be accessible from the web Pods, they will have access to it. And if your web Pods are exposed to the external world e.g. via LoadBalancer or NodePort, they will be accessible from outside. And yes, nodes won't be accessible from outside by their cluster IPs as those are private internal IPs of a kubernetes cluster.
- Question 1.4 - Am I correct that dns i.e. servicename.namespace.svc.cluster.local would map the clusterIP?
Yes, specifically the cluster IP of this Service. More on that you can find here.
- Question 1.5 - For which type of applications I might use only clusterIP? Where multiple instances of an application need to communicate with each other (eg master-slave configuration)?
For something that doesn't need to be exposed externally, like backend services that are not accessible directly from outside but through some frontend Pods which process external requests and pass them to the backend afterwards. It may also be used for database pods, which should practically never be accessed directly from outside.
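For the database in your scenario, internal-only exposure could be sketched like this (the name and the PostgreSQL-style port are assumptions for illustration):

```yaml
# A ClusterIP Service for the db Pods: no external client can reach it;
# web Pods can reach it at database-service.<namespace>.svc.cluster.local:5432
apiVersion: v1
kind: Service
metadata:
  name: database-service   # hypothetical name
spec:
  selector:
    app: database          # matches the labels on the db Pods
  ports:
  - port: 5432             # assuming a PostgreSQL-style db port
    targetPort: 5432
```

Because no type is given, the Service defaults to ClusterIP, so the database stays reachable only from within the cluster.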
If I use nodePort then K8s will open a port on each of the nodes and will forward nodeIP/nodePort to cluster IP (on pod)/Cluster Port
Yes, in a NodePort Service configuration this destination port exposed by a Pod is called targetPort. Somewhere in between there is also a port, which refers to the port of the Service itself. So the Service has its ClusterIP (different than the backend Pods' IPs) and its port, which usually is the same as targetPort (targetPort defaults to the value set for port) but can be set to a different value.
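Putting the three port fields together, a NodePort Service sketch might look like this (the name and port values are illustrative assumptions):

```yaml
# Traffic path: <any-node-IP>:30080 -> service port 80 -> Pod targetPort 8080
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport     # hypothetical name
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - nodePort: 30080      # opened on every node (default range 30000-32767)
    port: 80             # the Service's own port (on its ClusterIP)
    targetPort: 8080     # the container's port
```

If nodePort is omitted, kubernetes picks a free port from the node port range automatically.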
- Question 2 - Can web nodes now access database nodes using nodeIP:nodePort which will route the traffic to database's clusterIP (on pod):clusterport/targetPort?
I think you've mixed it up a bit. If web is something external to the kubernetes cluster, it might make sense to access Pods deployed on the kubernetes cluster via nodeIP:nodePort, but if it's part of the same kubernetes cluster, it can use a simple ClusterIP Service.
( I have read that clusterIP/dns:nodePort will not work).
From the external world of course it won't work as Cluster IPs are not accessible from outside, they are internal kubernetes IPs. But from within the cluster? It's perfectly possible. As I said in a different part of my answer, kubernetes nodes also have their cluster IPs and it's perfectly possible to access your app on the nodePort but from within the cluster i.e. from some other Pod. So when you look at the internal (cluster) IP addresses of the nodes in my example it is also perfectly possible to run:
root@nginx-deployment-85ff79dd56-5lhsk:/# curl http://10.164.0.8:32641
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
...
- Question 2.1 - How do I get a node's IP? Is nodeIP the IP I'll get when I run describe pods command?
To check IPs of your nodes run:
kubectl get nodes --output wide
It will show you both their internal (yes, nodes also have their ClusterIPs!) and external IPs.
- Question 2.2 - Is there a dns equivalent for the node IP as node IP could change during failovers. Or does dns now resolve to the node's IP instead of clusterIP?
No, there isn't. Take a look at What things get DNS names?
- Question 2.3 - I read that K8s will create endpoints for each service. Is endpoint same as node or is it same as pod? If I run kubectl describe pods or kubectl get endpoints, would I get same IPs?
No, endpoints is another type of kubernetes API object / resource.
$ kubectl api-resources | grep endpoints
endpoints ep true Endpoints
If you run:
kubectl explain endpoints
you will get its detailed description:
KIND: Endpoints
VERSION: v1
DESCRIPTION:
Endpoints is a collection of endpoints that implement the actual service.
Example: Name: "mysvc", Subsets: [
{
Addresses: [{"ip": "10.10.1.1"}, {"ip": "10.10.2.2"}],
Ports: [{"name": "a", "port": 8675}, {"name": "b", "port": 309}]
},
{
Addresses: [{"ip": "10.10.3.3"}],
Ports: [{"name": "a", "port": 93}, {"name": "b", "port": 76}]
},
]
Usually you don't have to worry about creating the endpoints resource as it is created automatically. So to answer your question, endpoints stores information about Pods' IPs and keeps track of them, as Pods can be destroyed and recreated and their IPs are subject to change. For a Service to keep routing the traffic properly even though Pods' IPs change, an object like endpoints must exist which keeps track of those IPs.
You can easily check it by yourself. Simply create a deployment consisting of 3 Pods and expose it as a simple ClusterIP Service. Check its endpoints object. Then delete one Pod, verify that its IP has changed and check its endpoints object again. You can do it by running:
kubectl get ep <endpoints-object-name> -o yaml
or
kubectl describe ep <endpoints-object-name>
So basically the different endpoints (as many as there are backend Pods exposed by a certain Service) are the internal (ClusterIP) addresses of the Pods exposed by the Service, but the endpoints object / API resource is a single kubernetes resource that keeps track of those endpoints. I hope this is clear.
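For illustration, an Endpoints object for a Service backed by three Pods might look roughly like this (the name and IPs are made up; in reality kubernetes generates this object for you):

```yaml
# Auto-generated by kubernetes for a Service named "web-service";
# the addresses are the cluster-internal IPs of the backing Pods.
apiVersion: v1
kind: Endpoints
metadata:
  name: web-service      # always matches the Service name
subsets:
- addresses:
  - ip: 10.10.1.11       # Pod 1 IP (illustrative)
  - ip: 10.10.1.12       # Pod 2 IP
  - ip: 10.10.2.13       # Pod 3 IP
  ports:
  - port: 8080           # the Service's targetPort
```

When a Pod is deleted and recreated with a new IP, the corresponding address entry is updated automatically.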
As I am on GCP, I can use Loadbalancer for the web cluster to get an external IP. Using the external IP, the client application can access the web service. I saw this configuration for a LoadBalancer:
spec:
  selector:
    app: MyApp
  ports:
  - protocol: TCP
    port: 80
    targetPort: 9376
  type: LoadBalancer
- Question 3 - Is it exposing an external IP and port 80 to the outside world? What would be the value of nodePort in this case?
Yes, under the hood a call to GCP API is made so that external http/https loadbalancer with a public IP is created.
Suppose you have a Deployment called nginx-deployment. If you run:
kubectl expose deployment nginx-deployment --type LoadBalancer
It will create a new Service of LoadBalancer type. If you then run:
kubectl get svc
you will see that your LoadBalancer Service has both an external IP and a cluster IP assigned.
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx-deployment LoadBalancer 10.3.248.43 <some external ip> 80:32641/TCP 102s
If you run:
$ kubectl get svc nginx-deployment
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx-deployment LoadBalancer 10.3.248.43 <some external ip> 80:32641/TCP 👈 16m
You'll notice that the nodePort value for this Service has also been set, in this case to 32641. If you want to dive into it even deeper, run:
kubectl get svc nginx-deployment -o yaml
and you will see it in this section:
...
spec:
clusterIP: 10.3.248.43
externalTrafficPolicy: Cluster
ports:
- nodePort: 32641 👈
port: 80
protocol: TCP
targetPort: 80
selector:
app: nginx
sessionAffinity: None
type: LoadBalancer 👈
...
As you can see, although the Service type is LoadBalancer it also has its nodePort value set. And you can test that it works by accessing your Deployment using this port, but not on the IP of the LoadBalancer but on the IPs of your nodes. I know it may seem pretty confusing as LoadBalancer and NodePort are two different Service types. The LB needs to distribute the incoming traffic to some backend Pods (e.g. managed by a Deployment) and needs this nodePort value set in its own specification to be able to route the traffic to Pods scheduled on different nodes. I hope this is a bit clearer now.
Upvotes: 3