kubernetes architecture

  • one or more master nodes
  • one or more worker nodes
  • distributed key-value store (holds the cluster state)

master nodes

master nodes are the entry point for administrative tasks. if there is more than one master node, the cluster runs in HA (high availability) mode.

in that case, one of them is the ‘leader’ and the others are followers.

all master nodes connect to a distributed key-value store that holds the cluster state. kubernetes uses etcd for this.

a master node has the following components:

  • API server: users send administrative tasks to kubernetes through the API server. it receives the requests, validates and processes them, and finally stores the resulting state in etcd.
  • scheduler: assigns work, in the form of pods, to the worker nodes.
  • controller manager: runs the control loops. each control loop knows the desired state of the objects it manages and watches their current state through the API server. if the current state differs from the desired state, the loop takes corrective steps.
  • etcd: stores the cluster state and configuration details.

worker nodes

a worker node is a machine that runs applications using pods. it is controlled by the master node.

a Pod is the scheduling unit in k8s. it is a logical collection of one or more containers which are always scheduled together.

a worker node has the following components:

  • container runtime: runs the containers. examples are containerd, rkt and lxd. docker uses containerd underneath.
  • kubelet: communicates with the master node. it receives pod definitions from the API server and runs the containers associated with each pod. kubelet talks to the container runtime through the Container Runtime Interface (CRI).
    • cri shims: dockershim and cri-containerd
  • kube-proxy: a network proxy daemon that runs on every worker node (see the kube-proxy section under services).

network setup challenges

  • a unique ip is assigned to each pod
  • containers in a pod can communicate with each other
  • a pod can communicate with other pods in the cluster
  • if configured, the application deployed inside a pod is accessible from the external world

set a unique ip to each pod

the container runtime offloads ip assignment to CNI (Container Network Interface), which connects to the underlying plugin. once the plugin has assigned an ip address, CNI passes it back to the container runtime that requested it.

container communication inside a pod

inside a pod, containers share the network namespace, so they can reach each other via localhost.
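
to make the shared network namespace concrete, here is a minimal sketch of a pod with two containers. the pod name, container names and images are assumptions chosen for illustration. the second container reaches nginx in the first one simply through localhost.

```yaml
# hypothetical pod: names and images are illustrative only
apiVersion: v1
kind: Pod
metadata:
  name: localhost-demo
spec:
  containers:
  - name: web
    image: nginx:alpine
    ports:
    - containerPort: 80
  - name: poller
    image: busybox
    # same network namespace as "web", so localhost:80 reaches nginx
    command: ["sh", "-c", "while true; do wget -qO- http://localhost:80 > /dev/null && echo ok; sleep 5; done"]
```

because both containers share one ip and one port space, they cannot both listen on the same port.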

pod to pod communication

  • routable pods and nodes, using the underlying physical infrastructure
  • using software defined networking, like Flannel, Weave, Calico

pod communication with the outside world

by exposing the application through a Service, which kube-proxy implements on each worker node.

different kubernetes configurations

  • all-in-one single node installation: Minikube. it is useful for development
  • single-node etcd, single master and multiple workers
  • single-node etcd, multiple masters and multiple workers
  • multi-node etcd, multiple masters and multiple workers

pod

a pod is the smallest and simplest kubernetes object. it is the unit of deployment in kubernetes and represents a single instance of the application. a pod is a logical collection of one or more containers.

labels

labels are key-value pairs that can be attached to any kubernetes object. labels are used to organize and select a subset of objects, based on the requirements in place. many objects can have the same label(s).

we can also use a label selector to select objects with certain labels, eg: 'env==dev'
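
as a small sketch, labels are set under metadata.labels of an object. the pod name and the label keys and values below are assumptions. a selector such as env==dev would match this pod.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: frontend-dev      # hypothetical name
  labels:
    app: frontend         # assumed label
    env: dev              # matched by the selector env==dev
spec:
  containers:
  - name: web
    image: nginx:alpine
```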

ReplicationControllers

a ReplicationController (rc) is a controller that is part of the master node’s controller manager. an rc creates or kills pods so that the number of running pods matches the desired count we set. rather than creating pods directly, we generally use controllers like rc to create and manage them.

ReplicaSet

A ReplicaSet (rs) is the next-generation ReplicationController.
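
one practical difference is that a ReplicaSet also supports set-based selectors. the manifest below is only a sketch: the name, labels, replica count and image are assumptions.

```yaml
# sketch only: the name, labels and replica count are assumptions
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: frontend
spec:
  replicas: 3
  selector:
    matchExpressions:
    # set-based requirement: matches pods whose env label is dev or qa
    - key: env
      operator: In
      values: ["dev", "qa"]
  template:
    metadata:
      labels:
        app: frontend
        env: dev          # must satisfy the selector above
    spec:
      containers:
      - name: web
        image: nginx:alpine
```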

deployments

Deployment objects provide declarative updates to Pods and ReplicaSets. The DeploymentController is part of the master node’s controller manager, and it makes sure that the current state always matches the desired state.

deployment rollout

a rollout is an update to a deployment object that changes its pod template. eg, if we update the image from 1.7.9 to 1.9.1, a rollout happens and the deployment creates a new ReplicaSet for the new version.

after the rollout, the deployment keeps the previous ReplicaSets, so we can roll back to a previously known state if something goes wrong.
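
how a rollout proceeds, and how much history is kept for rollbacks, can be tuned in the deployment spec. the fragment below is a sketch and the values are illustrative assumptions, not recommendations.

```yaml
# fragment of a Deployment spec; the values are illustrative assumptions
spec:
  replicas: 3
  revisionHistoryLimit: 5    # how many old ReplicaSets to keep for rollbacks
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1            # at most 1 pod above the desired count during the update
      maxUnavailable: 1      # at most 1 pod may be unavailable during the update
```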

namespaces

If we have numerous users whom we would like to organize into teams/projects, we can partition the Kubernetes cluster into sub-clusters using Namespaces.

To list all the Namespaces, we can run the following command:

$ kubectl get namespaces
NAME          STATUS       AGE
default       Active       11h
kube-public   Active       11h
kube-system   Active       11h

Generally, Kubernetes creates two default Namespaces: kube-system and default. The kube-system Namespace contains the objects created by the Kubernetes system. The default Namespace contains the objects which do not belong to any other Namespace. By default, we connect to the default Namespace. kube-public is a special Namespace, which is readable by all users and used for special purposes, like bootstrapping a cluster.
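
a namespace is itself an object we can create from a manifest. in this sketch the namespace name team-a and the ConfigMap are hypothetical. the second document shows how an object is placed into a namespace.

```yaml
# hypothetical namespace for a team; the name is an assumption
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
---
# an object is placed in a namespace through metadata.namespace
apiVersion: v1
kind: ConfigMap
metadata:
  name: team-settings
  namespace: team-a
data:
  owner: team-a
```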

authentication

kubernetes has two kinds of users:

  • normal users: they are managed outside of the k8s cluster.
  • service accounts: they are used by in-cluster processes to communicate with the API server.

different authentication modules can be used, for example client certificates, bearer tokens and OpenID Connect tokens.
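
unlike normal users, service accounts are regular kubernetes objects. as a sketch (both names below are hypothetical), a pod can be told which service account to use when it talks to the API server:

```yaml
# sketch: names are hypothetical
apiVersion: v1
kind: ServiceAccount
metadata:
  name: build-robot
---
apiVersion: v1
kind: Pod
metadata:
  name: api-client
spec:
  serviceAccountName: build-robot   # the pod talks to the API server as this account
  containers:
  - name: main
    image: nginx:alpine
```

if serviceAccountName is left out, the pod uses the default service account of its namespace.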

services

label selector

label selectors are important here. pods are ephemeral, so they have no static ip. kubernetes therefore puts a level of abstraction in front of them, the Service, and users connect to the Service instead of to individual pods. the Service in turn finds its pods with a label selector.
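
a minimal sketch of such a Service follows. the service name, the label and the target port are assumptions. whichever pods currently carry the label receive the traffic, no matter how often they are replaced.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend-svc          # hypothetical name
spec:
  selector:
    app: frontend             # traffic goes to every pod carrying this label
  ports:
  - protocol: TCP
    port: 80                  # port exposed by the Service
    targetPort: 8080          # assumed port the pods listen on
```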

kube-proxy

All of the worker nodes run a daemon called kube-proxy, which watches the API server on the master node for the addition and removal of Services and endpoints.

service discovery

kubernetes has an add-on for DNS, which creates a DNS record for each service in a fixed format (by default <service>.<namespace>.svc.cluster.local), so that services in the same namespace can reach each other by name alone.

pods in other namespaces can reach a service by adding the namespace as a suffix to the name, eg: web-service.default
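
one way to try this out is a throwaway pod that resolves a service name. this is only a sketch: it assumes a service called web-service exists in the default namespace and that the cluster domain is the default cluster.local. the nslookup output of busybox also differs between image versions.

```yaml
# throwaway pod for trying out DNS-based discovery; all names are assumptions
apiVersion: v1
kind: Pod
metadata:
  name: dns-check
  namespace: default
spec:
  restartPolicy: Never
  containers:
  - name: lookup
    image: busybox
    command:
    - sh
    - -c
    # the short name works inside the same namespace; the longer forms work from any namespace
    - nslookup web-service; nslookup web-service.default; nslookup web-service.default.svc.cluster.local
```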

service type

ClusterIP and NodePort

The NodePort ServiceType is useful when we want to make our Services accessible from the external world.

LoadBalancer

ExternalIP

Please note that ExternalIPs are not managed by Kubernetes. The cluster administrator has to configure the routing that maps the ExternalIP address to one of the nodes.

ExternalName
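
an ExternalName service has no selector and no endpoints. it maps the service name to an external DNS name by returning a CNAME record. the names in this sketch are assumptions.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: external-db                 # hypothetical name
spec:
  type: ExternalName
  externalName: db.example.com      # assumed external DNS name
```

pods in the cluster can then use external-db as if it were an in-cluster service.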

useful commands

# list the deployments
kubectl get deployments

# list the replicasets
kubectl get replicasets

# list the pods
kubectl get pods
  • look at the details of a pod
$ kubectl describe pod webserver-74d8bd488f-dwbzz

Name:           webserver-74d8bd488f-dwbzz
Namespace:      default
Node:           minikube/192.168.99.100
Start Time:     Thu, 22 Mar 2018 09:17:33 +0530
Labels:         k8s-app=webserver
                pod-template-hash=3084680449
Annotations:    <none>
Status:         Running
IP:             172.17.0.5
Controlled By:  ReplicaSet/webserver-74d8bd488f
Containers:
  webserver:
    Container ID:   docker://96302d70903fe3b45d5ff3745a706d67d77411c5378f1f293a4bd721896d6420
    Image:          nginx:alpine
    Image ID:       docker-pullable://nginx@sha256:8d5341da24ccbdd195a82f2b57968ef5f95bc27b3c3691ace0c7d0acf5612edd
    Port:           <none>
    State:          Running
      Started:      Thu, 22 Mar 2018 09:17:33 +0530
    Ready:          True
    Restart Count:  0

# list the pods, showing the k8s-app and label2 labels as columns
$ kubectl get pods -L k8s-app,label2
NAME                         READY   STATUS    RESTARTS   AGE   K8S-APP     LABEL2
webserver-74d8bd488f-dwbzz   1/1     Running   0          14m   webserver   <none>
webserver-74d8bd488f-npkzv   1/1     Running   0          14m   webserver   <none>
webserver-74d8bd488f-wvmpq   1/1     Running   0          14m   webserver   <none>

# list the pods whose k8s-app label is set to webserver
$ kubectl get pods -l k8s-app=webserver
NAME                         READY     STATUS    RESTARTS   AGE
webserver-74d8bd488f-dwbzz   1/1       Running   0          17m
webserver-74d8bd488f-npkzv   1/1       Running   0          19m
webserver-74d8bd488f-wvmpq   1/1       Running   0          17m
  • delete deployments
$ kubectl delete deployments webserver
deployment "webserver" deleted
  • create a deployment
$ kubectl create -f webserver.yaml
deployment "webserver" created

webserver.yaml will be like below:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: webserver
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80

creating the deployment also creates its replicaset and pods.

  • exposing a port to the outside world

above we saw how to create a deployment from a yaml file. with the ServiceType we can define the access mode of a given Service.

with the NodePort service type, kubernetes opens up a static port on all the worker nodes. the yaml file looks like this:

apiVersion: v1
kind: Service
metadata:
  name: web-service
  labels:
    run: web-service
spec:
  type: NodePort
  ports:
  - port: 80
    protocol: TCP
  selector:
    app: nginx 
  • list the services
$ kubectl get svc
NAME          TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes    ClusterIP   10.96.0.1      <none>        443/TCP        1d
web-service   NodePort    10.110.47.84   <none>        80:31074/TCP   12s
  • get the ip of the minikube node, which together with the NodePort (31074 above) gives the address of the application

minikube ip

  • access application in browser

minikube service web-service

  • liveness probe

a liveness probe checks the health of the application running in a container.

In the following example, the kubelet sends the HTTP GET request to the /healthz endpoint of the application, on port 8080. If that returns a failure, then the kubelet will restart the affected container; otherwise, it would consider the application to be alive.

livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: X-Custom-Header
          value: Awesome
      initialDelaySeconds: 3
      periodSeconds: 3

similarly, with a tcp probe the kubelet tries to open a socket to the container on the given port:

livenessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 20
  • readiness probe

a readiness probe ensures that certain conditions are met before the application receives traffic.
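
a readiness probe is configured like a liveness probe. the fragment below is a sketch that uses a command-based check. the file path and timings are assumptions. the container only counts as ready while the command exits with status 0.

```yaml
# fragment of a container spec; the file path and timings are assumptions
readinessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5
```

unlike a failing liveness probe, a failing readiness probe does not restart the container. the pod is just taken out of the Service endpoints until the probe succeeds again.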

volume management

In Kubernetes, a Volume is attached to a Pod and shared among the containers of that Pod.

  • emptyDir

An empty Volume is created for the Pod as soon as it is scheduled on the worker node. The Volume’s life is tightly coupled with the Pod. If the Pod dies, the content of emptyDir is deleted forever.

  • hostPath
  • gcePersistentDisk
  • awsElasticBlockStore
  • nfs
  • iscsi
  • secret
  • persistentVolumeClaim
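
as a sketch of the emptyDir type from the list above, the pod below shares one volume between two containers. all names, images and paths are assumptions.

```yaml
# sketch: two containers sharing one emptyDir volume; names are hypothetical
apiVersion: v1
kind: Pod
metadata:
  name: shared-scratch
spec:
  volumes:
  - name: scratch
    emptyDir: {}
  containers:
  - name: writer
    image: busybox
    command: ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"]
    volumeMounts:
    - name: scratch
      mountPath: /data
  - name: reader
    image: busybox
    command: ["sh", "-c", "touch /data/out.log; tail -f /data/out.log"]
    volumeMounts:
    - name: scratch
      mountPath: /data
```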

in a containerized world, k8s provides the PersistentVolume (PV) subsystem, with APIs for users and administrators to manage and consume storage.

PVC

a PersistentVolumeClaim(PVC) is a request for storage by a user.
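
a minimal sketch of a claim and a pod that uses it follows. the claim name, the requested size and the mount path are assumptions, and the cluster needs a matching PersistentVolume or dynamic provisioning for the claim to be bound.

```yaml
# sketch: claim name, size and mount path are assumptions
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: claim-user
spec:
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-claim     # refers to the claim defined above
  containers:
  - name: web
    image: nginx:alpine
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html
```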

Container Storage Interface (CSI)

storage vendors and community members from different container orchestrators are working together to standardize the volume interface as CSI.

kubernetes v1.9 added alpha support for CSI, which makes it easy to install CSI-compliant volume plugins.

config app using ConfigMaps

ConfigMaps allow us to decouple the configuration from the container image. we can pass configuration as key-value pairs.

examples:

  • by literal:
$ kubectl create configmap my-config --from-literal=key1=value1 --from-literal=key2=value2
configmap "my-config" created 
  • view the resulting ConfigMap:
$ kubectl get configmaps my-config -o yaml
apiVersion: v1
data:
  key1: value1
  key2: value2
kind: ConfigMap
metadata:
  creationTimestamp: 2017-05-31T07:21:55Z
  name: my-config
  namespace: default
  resourceVersion: "241345"
  selfLink: /api/v1/namespaces/default/configmaps/my-config
  uid: d35f0a3d-45d1-11e7-9e62-080027a46057

configuration file for ConfigMap creation

First, we need to create a configuration file. We can have a configuration file with the content like:

apiVersion: v1
kind: ConfigMap
metadata:
  name: customer1
data:
  TEXT1: Customer1_Company
  TEXT2: Welcomes You
  COMPANY: Customer1 Company Technology Pct. Ltd.

create ConfigMap

$ kubectl create -f customer1-configmap.yaml
configmap "customer1" created
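
once the ConfigMap exists, a pod can consume its keys, for example as environment variables. this is a sketch: the pod name, image and variable names are assumptions, while customer1 and its keys come from the file above.

```yaml
# sketch: pod and variable names are assumptions; customer1 is the ConfigMap created above
apiVersion: v1
kind: Pod
metadata:
  name: customer1-app
spec:
  containers:
  - name: web
    image: nginx:alpine
    env:
    - name: COMPANY
      valueFrom:
        configMapKeyRef:
          name: customer1
          key: COMPANY
    - name: TEXT1
      valueFrom:
        configMapKeyRef:
          name: customer1
          key: TEXT1
```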

Ingress

Ingress is another method to access the application

According to kubernetes.io:

An Ingress is a collection of rules that allow inbound connections to reach the cluster Services.

[diagram: what an ingress is]

with ingress, users don’t connect directly to a Service. they reach the ingress endpoint first, and from there the request is forwarded to the respective Service.

the forwarding is done by an ingress controller.

Ingress Controller

An Ingress Controller is an application which watches the Master Node’s API server for changes in the Ingress resources and updates the Layer 7 Load Balancer accordingly. Kubernetes has different Ingress Controllers, and, if needed, we can also build our own. GCE L7 Load Balancer and Nginx Ingress Controller are examples of Ingress Controllers.

a typical config file would be like this:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: web-ingress
  namespace: default
spec:
  rules:
  - host: blue.example.com
    http:
      paths:
      - backend:
          serviceName: webserver-blue-svc
          servicePort: 80
  - host: green.example.com
    http:
      paths:
      - backend:
          serviceName: webserver-green-svc
          servicePort: 80

and then create it.

$ kubectl create -f webserver-ingress.yaml
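
note that the manifest above uses the extensions/v1beta1 API, which kubernetes 1.22 and later no longer serve for Ingress. on a newer cluster, an equivalent in networking.k8s.io/v1 might look like the sketch below. the path / and pathType Prefix are assumptions, since the original rules do not specify a path.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
  namespace: default
spec:
  rules:
  - host: blue.example.com
    http:
      paths:
      - path: /                 # assumed: route everything for this host
        pathType: Prefix        # required in networking.k8s.io/v1
        backend:
          service:
            name: webserver-blue-svc
            port:
              number: 80
  - host: green.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: webserver-green-svc
            port:
              number: 80
```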

credit to