Core Kubernetes Concepts
By Diwanshu Shekhar
Assuming you already have some familiarity with Kubernetes, I’m going to lay out the core concepts of Kubernetes that are really important to understand -
Pods
First is the concept of Pods. Going back to the idea of containerization, a container is supposed to run only one process. In situations where closely related processes need to run together, Kubernetes groups their containers into a unit called a Pod. To summarize, a Pod is a group of one or more containers, where each container runs a single process and the processes across those containers are closely related.
One important point to note is that all containers in a pod share the same network namespace, meaning they have the same IP address, so we need to make sure all containers in a pod listen on unique ports to avoid port conflicts. By the same token, containers in different pods can never have port conflicts. Because Pods have unique IP addresses, we can think of Pods as logical hosts that behave much like physical hosts.
Another point to remember is that the pod is also the basic unit of scaling. Kubernetes can’t horizontally scale individual containers; instead, it scales whole pods. This is something to keep in mind when deciding which containers should be grouped together in a single pod.
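To make this concrete, here is a minimal sketch of a Pod manifest with two closely related containers sharing the pod’s network namespace (the names, images, and ports below are illustrative, not taken from any particular deployment):

```yaml
# Illustrative Pod: two closely related containers share the pod's
# network namespace, so each must listen on a different port.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-helper          # hypothetical name
  labels:
    app: web
spec:
  containers:
  - name: web                    # main process
    image: nginx:1.25
    ports:
    - containerPort: 80
  - name: metrics-sidecar        # closely related helper process
    image: prom/statsd-exporter:v0.26.0
    ports:
    - containerPort: 9102        # must not clash with the web container's port
```

Because both containers share one IP address, the sidecar deliberately listens on a different port than the web container.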
Services
Services are a way to expose a container application running in a Pod. We said earlier that Pods have their own IP addresses. We could potentially use the IP address of the pod and the port number of the container application to expose it to other applications, whether outside or inside Kubernetes. But this is not going to work very well, because Pods are ephemeral in nature, which means their IP addresses keep changing. Another reason is scaling: several Pods will be serving the same application, and other applications need a single point of contact to talk to it. This is where the Service comes into play in Kubernetes.
A Kubernetes Service is a resource you create to make a single, constant point of entry to a group of pods providing the same service. Each service has an IP address and port that never change while the service exists. A Kubernetes Service forwards requests from other applications either to an application in a Pod or to an application external to Kubernetes. We typically tie a service to Pods with the help of the selector field of the Service resource, and to external servers using an Endpoints resource. Behind the scenes, in both types of assignments, the Endpoints resource sits between the service and the actual servers.
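As a rough sketch of both assignment styles (all names and the IP address below are made up for illustration): a Service that selects Pods by label, and a selector-less Service whose Endpoints object we create ourselves to point at an external server.

```yaml
# Service that forwards to pods matching the selector; Kubernetes
# maintains the Endpoints object for us behind the scenes.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
  - port: 80          # stable port of the Service
    targetPort: 80    # container port in the selected pods
---
# Selector-less Service for an external server: we create the
# Endpoints object ourselves with the external IP and port.
apiVersion: v1
kind: Service
metadata:
  name: external-db
spec:
  ports:
  - port: 5432
---
apiVersion: v1
kind: Endpoints
metadata:
  name: external-db   # must match the Service name
subsets:
- addresses:
  - ip: 203.0.113.10  # illustrative external IP
  ports:
  - port: 5432
```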
Until now, we’ve seen how a Kubernetes Service allows us to point to a specific server internal to Kubernetes or to an external service. But another important benefit is that a service can also expose an application running in a Pod to the external world. This can be done in one of the following three ways -
- NodePort: The NodePort service type exposes the service on a static port on every node of the cluster
- LoadBalancer: This service type is applicable to Kubernetes clusters hosted on cloud infrastructure. It exposes the service through the cloud provider’s load balancer
- Ingress: Unlike the previous two, an Ingress is a separate resource rather than a service type. Each LoadBalancer service requires its own load balancer with its own public IP address, whereas an Ingress only requires one, even when providing access to dozens of services. When a client sends an HTTP request to the Ingress, the host and path in the request determine which service the request is forwarded to. In order to use Ingress, an Ingress controller add-on must be installed in the Kubernetes cluster; one popular choice is the nginx ingress controller. A sketch of a NodePort Service and an Ingress follows this list.
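As a non-authoritative sketch (the hostnames, service names, and port numbers are invented for illustration), a NodePort Service and an Ingress that routes by host and path might look like this:

```yaml
# NodePort Service: exposes the service on a static port on every node.
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080        # reachable on <any-node-ip>:30080
---
# Ingress: one entry point routing HTTP traffic to services by host/path.
# Requires an Ingress controller (e.g. ingress-nginx) in the cluster.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: example.com      # illustrative host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web      # the ClusterIP service from the earlier sketch
            port:
              number: 80
```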
Volumes
Each container in a pod has its own isolated filesystem. Every container starts with a brand-new filesystem when its Pod is created, and that filesystem is lost when the Pod is deleted or restarted. What if we want the data generated by a Pod to persist, so that when a new Pod is created it starts from the place where the previous Pod left off? This is where Kubernetes Volumes come to the rescue.
Kubernetes volumes are a component of a pod and are thus defined in the pod’s specification—much like containers. They aren’t a standalone Kubernetes object and cannot be created or deleted on their own. A volume is available to all containers in the pod, but it must be mounted in each container that needs to access it. In each container, you can mount the volume in any location of its filesystem.
You can define volumes in Pod.spec.volumes (see kubectl explain Pod.spec.volumes) and mount those volumes into containers in Pod.spec.containers.volumeMounts (see kubectl explain Pod.spec.containers.volumeMounts).
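For example, here is a hedged sketch of an emptyDir volume defined once in the pod spec and mounted into two containers (the images, commands, and paths are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-demo
spec:
  volumes:                       # defined at Pod.spec.volumes
  - name: shared-data
    emptyDir: {}
  containers:
  - name: writer
    image: busybox:1.36
    command: ["sh", "-c", "while true; do date >> /data/out.txt; sleep 5; done"]
    volumeMounts:                # mounted at Pod.spec.containers.volumeMounts
    - name: shared-data
      mountPath: /data
  - name: reader
    image: busybox:1.36
    command: ["sh", "-c", "sleep 10 && tail -f /data/out.txt"]
    volumeMounts:
    - name: shared-data
      mountPath: /data           # same volume, mounted into a second container
```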
There are different types of volumes available, some of which are: emptyDir, gitRepo, hostPath, gcePersistentDisk, awsElasticBlockStore, etc. But in order to make Kubernetes applications infrastructure agnostic, two abstraction layers called PersistentVolume and PersistentVolumeClaim were added to Kubernetes to hide the underlying storage infrastructure from the developer. A PersistentVolume is a Kubernetes resource that abstracts the underlying storage, a PersistentVolumeClaim is a request for a piece of that storage, and a Pod references the claim in the volumes field of its manifest.
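A rough sketch of this claim-based flow, assuming a cluster with dynamic provisioning and a storage class named standard (the class name, sizes, and all other names are placeholders):

```yaml
# The developer asks for storage without caring where it comes from.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
  storageClassName: standard    # placeholder; depends on the cluster
---
# The pod references the claim, not the underlying PersistentVolume.
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo
spec:
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-claim
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "echo hello >> /data/hello.txt && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
```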
ConfigMaps and Secrets
Environment variables can be defined at the container level under Pod.spec.containers for configuring applications, but this approach is limiting in the sense that we then need a separate Pod definition for each set of app configurations. In practice, this means having separate Pod definitions for production and development environments. Kubernetes provides a resource called ConfigMap that helps decouple an application from its configuration. A ConfigMap can be accessed by a Pod either through the env field of the container definition or by using a configMap volume and mounting it into a container’s filesystem (we discussed volume mounts in the Volumes section).
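Here is a minimal, illustrative sketch showing both consumption styles against a hypothetical ConfigMap called app-config (the keys, values, and paths are made up):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: debug
  app.properties: |
    greeting=hello
---
apiVersion: v1
kind: Pod
metadata:
  name: configmap-demo
spec:
  volumes:
  - name: config-volume
    configMap:
      name: app-config           # each key becomes a file in the volume
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "env | grep LOG_LEVEL && cat /etc/config/app.properties && sleep 3600"]
    env:
    - name: LOG_LEVEL            # single key injected as an env var
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: LOG_LEVEL
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config     # files appear here inside the container
```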
Secrets are just another way of storing app configurations that are sensitive in nature, such as passwords, encryption keys, TLS certificates, etc. Every pod in Kubernetes is assigned a default Secret that allows it to talk to the Kubernetes API server. Of course, a custom Secret can also be exposed to a container; after all, Secrets are just a variant of ConfigMap, which means we can create a secret volume and mount it into a container’s filesystem.
The contents of a Secret’s entries are shown as Base64-encoded strings, whereas those of a ConfigMap are shown in clear text. The reason for using Base64 encoding is simple. A Secret’s entries can contain binary values, not only plain-text. Base64 encoding allows you to include the binary data in YAML or JSON, which are both plain-text formats. Because not all sensitive data is in binary form, Kubernetes also allows setting a Secret’s values through the stringData field.
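A small illustrative Secret mixing both fields (the values below are dummies, not real credentials):

```yaml
# The same Secret can mix Base64-encoded data and plain-text stringData;
# Kubernetes encodes stringData and merges it into data on creation.
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
data:
  password: cGFzc3dvcmQ=       # "password" Base64-encoded
stringData:
  api-key: not-a-real-key      # plain text for convenience; stored encoded
```

The same Secret could also be created from the command line, for example with kubectl create secret generic app-secret --from-literal=password=password.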
Deployments
A Deployment resource is one of the important highlights of Kubernetes, as it allows us to update Pods automatically without making manual changes such as creating new ReplicaSets and new Pods and deleting the old Pods. All we need to do is define a Deployment resource, which is very similar to defining a ReplicaSet that manages pods. In fact, a Deployment creates a ReplicaSet behind the scenes for us, which in turn manages the Pods. The reason we have an additional resource called Deployment, and not just ReplicaSet, is that Deployments take care of automatically updating Pods with the help of the underlying ReplicaSets.
An update to a Deployment, most notably to the image field value, triggers a rollout of new pods. The strategy of the rollout can be defined in the Deployment manifest. Two deployment strategies are available, RollingUpdate and Recreate, which can be set as the value of the strategy field in the Deployment manifest (see kubectl explain Deployment.spec.strategy). A RollingUpdate has no downtime, but the application must be able to run two versions of itself side by side during the rollout. If this is a constraint, one may opt for the Recreate strategy, but the downside is that there will be some downtime during the update. kubectl rollout status deployment <name_of_deployment> and several other rollout commands are useful for tracking and troubleshooting deployments.
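Putting it together, here is a hedged sketch of a Deployment using the RollingUpdate strategy (the app name, image, and replica count are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  strategy:
    type: RollingUpdate         # or Recreate
    rollingUpdate:
      maxSurge: 1               # at most one extra pod during the rollout
      maxUnavailable: 0         # keep full capacity while updating
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25       # changing this tag triggers a rollout
        ports:
        - containerPort: 80
```

Changing the image tag (for example with kubectl set image deployment/web web=nginx:1.26) triggers a new rollout, which can then be watched with kubectl rollout status deployment web or rolled back with kubectl rollout undo deployment web.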