CKA Exam Commands¶
Basics¶
Create pod in finance namespace
k run redis --image=redis -n finance
Create a service and expose it on port 6379
apiVersion: v1
kind: Service
metadata:
  name: redis-service
spec:
  selector:
    app.kubernetes.io/name: MyApp
  ports:
  - protocol: TCP
    port: 6379
    targetPort: 6379
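Alternatively, assuming the service should target the redis pod created above (a pod created with k run gets the label run=redis), the imperative equivalent is:
k expose pod redis --name=redis-service --port=6379 -n finance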
Create a deployment named webapp using the image kodekloud/webapp-color with 3 replicas
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: kodekloud/webapp-color
        ports:
        - containerPort: 80
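The same deployment can also be created imperatively, which is usually faster in the exam:
k create deployment webapp --image=kodekloud/webapp-color --replicas=3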
Create a new pod called custom-nginx using the nginx image and expose it on container port 8080
apiVersion: v1
kind: Pod
metadata:
  name: custom-nginx
  labels:
    tier: db
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 8080
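Imperative equivalent (note that --port only sets the containerPort, it does not create a service):
k run custom-nginx --image=nginx --port=8080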
Create a new namespace called dev-ns.
k create ns dev-ns
Create a pod called httpd using the image httpd:alpine in the default namespace. Next, create a service of type ClusterIP by the same name (httpd). The target port for the service should be 80
k run httpd --image=httpd:alpine --port=80 --expose
service/httpd created
pod/httpd created
Scheduling¶
Node Affinity¶
# apply label
k label nodes node01 color=blue
# create a deployment
k create deployment blue --replicas=3 --image=nginx
# check the taints on the node
kubectl describe node controlplane | grep -i taints
Create a new deployment named red with the nginx image and 2 replicas, and ensure it gets placed on the controlplane node only. Use the label key node-role.kubernetes.io/control-plane
which is already set on the controlplane node.
# use the exists operator as shown below
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node-role.kubernetes.io/control-plane
            operator: Exists
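For the blue deployment above, a sketch of the matching affinity block (added under the Deployment's spec.template.spec), assuming the color=blue node label applied earlier:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: color
          operator: In
          values:
          - blue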
Manual Scheduling¶
Manually schedule the pod on node01
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  nodeName: node01
  containers:
  - image: nginx
    name: nginx
Labels and Selectors¶
# count the number of pods with env=dev label
k get po -l env=dev --no-headers | wc -l
# Identify the POD which is part of the prod environment, the finance BU and of frontend tier?
k get pod --selector env=prod,bu=finance,tier=frontend --show-labels
Taints and Tolerations¶
Node affinity is a property of Pods that attracts them to a set of nodes (either as a preference or a hard requirement). Taints are the opposite -- they allow a node to repel a set of pods.
Tolerations are applied to pods. Tolerations allow the scheduler to schedule pods with matching taints. Tolerations allow scheduling but don't guarantee scheduling: the scheduler also evaluates other parameters as part of its function.
Taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes. One or more taints are applied to a node; this marks that the node should not accept any pods that do not tolerate the taints.
# You add a taint to a node using kubectl taint. For example,
kubectl taint nodes node1 key1=value1:NoSchedule
# To remove the taint added by the command above, you can run:
kubectl taint nodes node1 key1=value1:NoSchedule-
# Specify a toleration for a pod in the PodSpec.
tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoSchedule"

tolerations:
- key: "key1"
  operator: "Exists"
  effect: "NoSchedule"
# Here's an example of a pod that uses tolerations:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  tolerations:
  - key: "example-key"
    operator: "Exists"
    effect: "NoSchedule"
Check taints on the nodes
k describe nodes node01 | grep -i taint # use describe instead of get
Create a taint on node01 with key of spray, value of mortein and effect of NoSchedule
k taint node node01 spray=mortein:NoSchedule
Create another pod named bee with the nginx image, which has a toleration set to the taint mortein.
First do dry run using
k run bee --image=nginx --dry-run=client -o yaml > test_pod.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: bee
  name: bee
spec:
  tolerations:
  - key: "spray"
    operator: "Equal"
    value: "mortein"
    effect: "NoSchedule"
  containers:
  - image: nginx
    name: bee
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
Resource Limits¶
If the node where a Pod is running has enough of a resource available, it's possible (and allowed) for a container to use more resource than its request for that resource specifies. However, a container is not allowed to use more than its resource limit.
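A minimal requests/limits sketch for a container (the names and values here are illustrative, not from the lab):
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"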
# replace the pod using the replace command
k replace --force -f /tmp/kubectl-edit-2304618812.yaml
pod "elephant" deleted
pod/elephant replaced
DaemonSets¶
A DaemonSet ensures that all (or some) Nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected. Deleting a DaemonSet will clean up the Pods it created.
Some typical uses of a DaemonSet are:
running a cluster storage daemon on every node
running a logs collection daemon on every node
running a node monitoring daemon on every node
On how many nodes are the pods scheduled by the DaemonSet kube-proxy?
k -n kube-system describe ds kube-proxy # check the pod status
Deploy a DaemonSet for FluentD Logging with Name: elasticsearch, Namespace: kube-system and Image: registry.k8s.io/fluentd-elasticsearch:1.20
How to create a DS ?
An easy way to create a DaemonSet is to first generate a YAML file for a Deployment with the command kubectl create deployment elasticsearch --image=registry.k8s.io/fluentd-elasticsearch:1.20 -n kube-system --dry-run=client -o yaml > fluentd.yaml. Next, remove the replicas, strategy and status fields from the YAML file using a text editor. Also, change the kind from Deployment to DaemonSet. Finally, create the DaemonSet by running kubectl create -f fluentd.yaml. The result looks like the sketch below.
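A sketch of the resulting manifest (the app: elasticsearch label is an arbitrary choice; it only has to match between selector and template):
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: elasticsearch
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: registry.k8s.io/fluentd-elasticsearch:1.20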
Static Pods¶
How many static pods exist in this cluster in all namespaces?
Run the command kubectl get pods --all-namespaces and look for those with -controlplane appended to the name.
What is the path of the directory holding the static pod definition files?
/etc/kubernetes/manifests/
Create a static pod named static-busybox that uses the busybox image and the command sleep 1000
kubectl run --restart=Never --image=busybox static-busybox --dry-run=client -o yaml --command -- sleep 1000 > /etc/kubernetes/manifests/static-busybox.yaml
The path need not be /etc/kubernetes/manifests. Make sure to check the path configured in the kubelet configuration file.
root@controlplane:~# ssh node01
root@node01:~# ps -ef | grep /usr/bin/kubelet
root 4147 1 0 14:05 ? 00:00:00 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock --pod-infra-container-image=registry.k8s.io/pause:3.9
root 4773 4733 0 14:05 pts/0 00:00:00 grep /usr/bin/kubelet
root@node01:~# grep -i staticpod /var/lib/kubelet/config.yaml
staticPodPath: /etc/just-to-mess-with-you
Logging and Monitoring¶
Identify the POD that consumes the most Memory(bytes) in default namespace.
k top pod
NAME       CPU(cores)   MEMORY(bytes)
elephant   19m          32Mi
lion       1m           18Mi
rabbit     129m         252Mi
Ans is rabbit (252Mi)
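If the kubectl version supports it, the output can be sorted directly:
k top pod --sort-by=memory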
Application Lifecycle Maintenance¶
What command is run at container startup?
FROM python:3.6-alpine
RUN pip install flask
COPY . /opt/
EXPOSE 8080
WORKDIR /opt
ENTRYPOINT ["python", "app.py"]
Ans is python app.py (the ENTRYPOINT)
# Another question
FROM python:3.6-alpine
RUN pip install flask
COPY . /opt/
EXPOSE 8080
WORKDIR /opt
ENTRYPOINT ["python", "app.py"]
CMD ["--color", "red"]
Ans is python app.py --color red
# Question 3
apiVersion: v1
kind: Pod
metadata:
  name: webapp-green
  labels:
    name: webapp-green
spec:
  containers:
  - name: simple-webapp
    image: kodekloud/webapp-color
    command: ["--color","green"]
---
FROM python:3.6-alpine
RUN pip install flask
COPY . /opt/
EXPOSE 8080
WORKDIR /opt
ENTRYPOINT ["python", "app.py"]
CMD ["--color", "red"]
Ans is --color green
What command is run at container startup? Assume the image was created from the Dockerfile in this directory
FROM python:3.6-alpine
RUN pip install flask
COPY . /opt/
EXPOSE 8080
WORKDIR /opt
ENTRYPOINT ["python", "app.py"]
CMD ["--color", "red"]
---
apiVersion: v1
kind: Pod
metadata:
  name: webapp-green
  labels:
    name: webapp-green
spec:
  containers:
  - name: simple-webapp
    image: kodekloud/webapp-color
    command: ["python", "app.py"]
    args: ["--color", "pink"]
Ans is python app.py --color pink
#Create a pod with the given specifications. By default it displays a blue background. Set the given command line arguments to change it to green. Command line arguments: --color=green
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: webapp-green
  name: webapp-green
spec:
  containers:
  - image: kodekloud/webapp-color
    args: ["--color","green"]
    name: webapp-green
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
Env variables¶
# create a cm with 2 params as shown below
k create cm webapp-config-map --from-literal APP_COLOR=darkblue --from-literal APP_OTHER=disregard
# Update the environment variable on the POD to use only the APP_COLOR key from the newly created ConfigMap.
apiVersion: v1
kind: Pod
metadata:
  labels:
    name: webapp-color
  name: webapp-color
  namespace: default
spec:
  containers:
  - name: webapp-color
    image: kodekloud/webapp-color
    env:
    - name: APP_COLOR
      valueFrom:
        configMapKeyRef:
          name: webapp-config-map
          key: APP_COLOR
Secrets¶
# The application is failing because the secret has not been created yet. Create a new secret named db-secret with the data given below.
k create secret generic db-secret --from-literal DB_Host=sql01 --from-literal DB_User=root --from-literal DB_Password=password123
Configure webapp-pod to load environment variables from the newly created secret.
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2023-11-19T06:03:03Z"
  labels:
    name: webapp-pod
  name: webapp-pod
  namespace: default
spec:
  containers:
  - image: kodekloud/simple-webapp-mysql
    imagePullPolicy: Always
    name: webapp
    envFrom:
    - secretRef:
        name: db-secret
Multi container pods¶
Create a multi-container pod with 2 containers
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: yellow
  name: yellow
spec:
  containers:
  - image: busybox
    name: lemon
    resources: {}
  - image: redis
    name: gold
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
Init Containers¶
# Sample
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app.kubernetes.io/name: MyApp
spec:
  containers:
  - name: myapp-container
    image: busybox:1.28
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
  initContainers:
  - name: init-myservice
    image: busybox:1.28
    command: ['sh', '-c', "until nslookup myservice.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for myservice; sleep 2; done"]
  - name: init-mydb
    image: busybox:1.28
    command: ['sh', '-c', "until nslookup mydb.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for mydb; sleep 2; done"]
Cluster Maintenance¶
We need to take node01 out for maintenance. Empty the node of all applications and mark it unschedulable.
k drain node01 --ignore-daemonsets
node/node01 cordoned
Warning: ignoring DaemonSet-managed Pods: kube-flannel/kube-flannel-ds-f5ttj, kube-system/kube-proxy-vg6wp
evicting pod default/blue-6b478c8dbf-vrh86
evicting pod default/blue-6b478c8dbf-j7g8d
pod/blue-6b478c8dbf-j7g8d evicted
pod/blue-6b478c8dbf-vrh86 evicted
node/node01 drained
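Once maintenance is complete, mark the node schedulable again:
k uncordon node01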
Drain fails when a pod on the node is not managed by a controller:
k drain node01 --ignore-daemonsets
node/node01 cordoned
error: unable to drain node "node01" due to error:cannot delete Pods declare no controller (use --force to override): default/hr-app, continuing command...
There are pending nodes to be drained:
node01
cannot delete Pods declare no controller (use --force to override): default/hr-app
What is the current version of the cluster?
Run kubectl get nodes and look at the VERSION column.
What is the latest stable version of Kubernetes as of today?
kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.26.0
[upgrade/versions] kubeadm version: v1.26.0
I1119 12:04:28.812883 18925 version.go:256] remote version is much newer: v1.28.4; falling back to: stable-1.26
[upgrade/versions] Target version: v1.26.11
[upgrade/versions] Latest version in the v1.26 series: v1.26.11
Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT CURRENT TARGET
kubelet 2 x v1.26.0 v1.26.11
Upgrade to the latest version in the v1.26 series:
COMPONENT CURRENT TARGET
kube-apiserver v1.26.0 v1.26.11
kube-controller-manager v1.26.0 v1.26.11
kube-scheduler v1.26.0 v1.26.11
kube-proxy v1.26.0 v1.26.11
CoreDNS v1.9.3 v1.9.3
etcd 3.5.6-0 3.5.6-0
The latest stable version is v1.28.4, shown in the "remote version" line above.
Upgrade the controlplane components to exact version v1.27.0
Upgrade the kubeadm tool (if not already), then the controlplane components, and finally the kubelet. Practice referring to the Kubernetes documentation page.
Note: While upgrading kubelet, if you hit dependency issues when running apt-get upgrade kubelet, use the apt-get install kubelet=1.27.0-00 command instead.
# The commands below upgrade the node01 worker node.
# If you are on the controlplane node:
ssh node01 # log in to node01
# This will update the package lists from the software repository.
apt-get update
# This will install the kubeadm version 1.27.0.
apt-get install kubeadm=1.27.0-00
# This will upgrade the node01 configuration.
kubeadm upgrade node
# This will update the kubelet with the version 1.27.0.
apt-get install kubelet=1.27.0-00
# You may need to reload the daemon and restart the kubelet service after it has been upgraded.
systemctl daemon-reload
systemctl restart kubelet
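For the controlplane node itself (which the question above asks about), the sequence is a sketch of the standard kubeadm flow; the package versions follow the same pattern, and you may need to drain the node first if the task requires it:
# On the controlplane node
apt-get update
apt-get install kubeadm=1.27.0-00
kubeadm upgrade plan
kubeadm upgrade apply v1.27.0
apt-get install kubelet=1.27.0-00
systemctl daemon-reload
systemctl restart kubelet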
At what address can you reach the ETCD cluster from the controlplane node?
k -n kube-system describe po etcd-controlplane
Name: etcd-controlplane
Namespace: kube-system
Priority: 2000001000
Priority Class Name: system-node-critical
Node: controlplane/192.25.158.9
Start Time: Sun, 19 Nov 2023 12:39:56 -0500
Labels: component=etcd
tier=control-plane
Annotations: kubeadm.kubernetes.io/etcd.advertise-client-urls: https://192.25.158.9:2379
kubernetes.io/config.hash: 88719a0e6555d94fd96af8b6011a2af6
kubernetes.io/config.mirror: 88719a0e6555d94fd96af8b6011a2af6
kubernetes.io/config.seen: 2023-11-19T12:39:38.200107235-05:00
kubernetes.io/config.source: file
Status: Running
SeccompProfile: RuntimeDefault
IP: 192.25.158.9
IPs:
IP: 192.25.158.9
Controlled By: Node/controlplane
Containers:
etcd:
Container ID: containerd://f21102066ab677d48612ffc74802a43ae023daa92feeab805b0a80da2e53f495
Image: registry.k8s.io/etcd:3.5.7-0
Image ID: registry.k8s.io/etcd@sha256:51eae8381dcb1078289fa7b4f3df2630cdc18d09fb56f8e56b41c40e191d6c83
Port: <none>
Host Port: <none>
Command:
etcd
--advertise-client-urls=https://192.25.158.9:2379
--cert-file=/etc/kubernetes/pki/etcd/server.crt
--client-cert-auth=true
--data-dir=/var/lib/etcd
--experimental-initial-corrupt-check=true
--experimental-watch-progress-notify-interval=5s
--initial-advertise-peer-urls=https://192.25.158.9:2380
--initial-cluster=controlplane=https://192.25.158.9:2380
--key-file=/etc/kubernetes/pki/etcd/server.key
--listen-client-urls=https://127.0.0.1:2379,https://192.25.158.9:2379
--listen-metrics-urls=http://127.0.0.1:2381
--listen-peer-urls=https://192.25.158.9:2380
--name=controlplane
--peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
--peer-client-cert-auth=true
--peer-key-file=/etc/kubernetes/pki/etcd/peer.key
--peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
--snapshot-count=10000
--trusted-ca-file=/et
Check the --listen-client-urls flag shown above; ETCD is reachable from the controlplane at https://127.0.0.1:2379.
Backup the Etcd
The master node in our cluster is planned for a regular maintenance reboot tonight. While we do not anticipate anything to go wrong, we are required to take the necessary backups. Take a snapshot of the ETCD database using the built-in snapshot functionality.
Store the backup file at location /opt/snapshot-pre-boot.db
ETCDCTL_API=3 etcdctl --endpoints=https://[127.0.0.1]:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot save /opt/snapshot-pre-boot.db
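Optionally verify the snapshot file (etcdctl v3 syntax; this reads the file locally, so no endpoint or certificates are needed):
ETCDCTL_API=3 etcdctl snapshot status /opt/snapshot-pre-boot.db --write-out=table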
How many clusters are defined in the kubeconfig on the student-node?
k config get-clusters
NAME
cluster2
cluster1
How to switch from one cluster to another?
k config use-context cluster1
#If you check out the pods running in the kube-system namespace in cluster1, you will notice that etcd is running as a pod:
$ kubectl config use-context cluster1
Switched to context "cluster1".
$ kubectl get pods -n kube-system | grep etcd
etcd-cluster1-controlplane 1/1 Running 0 9m26s
# This means that ETCD is set up as a Stacked ETCD Topology where the distributed data storage cluster provided by etcd is stacked on top of the cluster formed by the nodes managed by kubeadm that run control plane components.
# Using the external etcd
If you check out the pods running in the kube-system namespace in cluster2, you will notice that there are NO etcd pods running in this cluster!
student-node ~ ➜ kubectl config use-context cluster2
Switched to context "cluster2".
student-node ~ ➜ kubectl get pods -n kube-system | grep etcd
student-node ~ ✖
Also, there is NO static pod configuration for etcd under the static pod path:
student-node ~ ✖ ssh cluster2-controlplane
Welcome to Ubuntu 18.04.6 LTS (GNU/Linux 5.4.0-1086-gcp x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
This system has been minimized by removing packages and content that are
not required on a system that users do not log into.
To restore this content, you can run the 'unminimize' command.
Last login: Wed Aug 31 05:05:04 2022 from 10.1.127.14
cluster2-controlplane ~ ➜ ls /etc/kubernetes/manifests/ | grep -i etcd
cluster2-controlplane ~ ✖
However, if you inspect the process on the controlplane for cluster2, you will see that the process for the kube-apiserver is referencing an external etcd datastore:
cluster2-controlplane ~ ✖ ps -ef | grep etcd
root 1705 1320 0 05:03 ? 00:00:31 kube-apiserver --advertise-address=10.1.127.3 --allow-privileged=true --authorization-mode=Node,RBAC --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/pki/etcd/ca.pem --etcd-certfile=/etc/kubernetes/pki/etcd/etcd.pem --etcd-keyfile=/etc/kubernetes/pki/etcd/etcd-key.pem --etcd-servers=https://10.1.127.10:2379 --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-allowed-names=front-proxy-client --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6443 --service-account-issuer=https://kubernetes.default.svc.cluster.local --service-account-key-file=/etc/kubernetes/pki/sa.pub --service-account-signing-key-file=/etc/kubernetes/pki/sa.key --service-cluster-ip-range=10.96.0.0/12 --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
root 5754 5601 0 05:15 pts/0 00:00:00 grep etcd
cluster2-controlplane ~ ➜
# You can see the same information by inspecting the kube-apiserver pod (which runs as a static pod in the kube-system namespace):
What is the IP address of the External ETCD datastore used in cluster2?
ps -ef | grep etcd # after doing ssh to controlPlane
root 1747 1383 0 20:07 ? 00:05:56 kube-apiserver --advertise-address=192.28.229.12 --allow-privileged=true --authorization-mode=Node,RBAC --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/pki/etcd/ca.pem --etcd-certfile=/etc/kubernetes/pki/etcd/etcd.pem --etcd-keyfile=/etc/kubernetes/pki/etcd/etcd-key.pem --etcd-servers=https://192.28.229.24:2379 --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-allowed-names=front-proxy-client --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6443 --service-account-issuer=https://kubernetes.default.svc.cluster.local --service-account-key-file=/etc/kubernetes/pki/sa.pub --service-account-signing-key-file=/etc/kubernetes/pki/sa.key --service-cluster-ip-range=10.96.0.0/12 --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
IP is 192.28.229.24
What is the default data directory used the for ETCD datastore used in cluster1?
ps -ef | grep -i etcd
root 1867 1383 0 20:08 ? 00:02:25 etcd --advertise-client-urls=https://192.28.229.9:2379 --cert-file=/etc/kubernetes/pki/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/etcd --experimental-initial-corrupt-check=true --initial-advertise-peer-urls=https://192.28.229.9:2380 --initial-cluster=cluster1-controlplane=https://192.28.229.9:2380 --key-file=/etc/kubernetes/pki/etcd/server.key --listen-client-urls=https://127.0.0.1:2379,https://192.28.229.9:2379 --listen-metrics-urls=http://127.0.0.1:2381 --listen-peer-urls=https://192.28.229.9:2380 --name=cluster1-controlplane --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt --peer-client-cert-auth=true --peer-key-file=/etc/kubernetes/pki/etcd/peer.key --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt --snapshot-count=10000 --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
Ans is var/lib/etcd
---
# First set the context to cluster1:
$ kubectl config use-context cluster1
Switched to context "cluster1".
# Next, inspect the endpoints and certificates used by the etcd pod. We will make use of these to take the backup.
$ kubectl describe pods -n kube-system etcd-cluster1-controlplane | grep advertise-client-urls
--advertise-client-urls=https://10.1.218.16:2379
$ kubectl describe pods -n kube-system etcd-cluster1-controlplane | grep pki
--cert-file=/etc/kubernetes/pki/etcd/server.crt
--key-file=/etc/kubernetes/pki/etcd/server.key
--peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
--peer-key-file=/etc/kubernetes/pki/etcd/peer.key
--peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
/etc/kubernetes/pki/etcd from etcd-certs (rw)
Path: /etc/kubernetes/pki/etcd
# SSH to the controlplane node of cluster1 and then take the backup using the endpoints and certificates we identified above:
controlplane$
ETCDCTL_API=3 etcdctl \
--endpoints=https://10.1.220.8:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot save /opt/cluster1.db
Snapshot saved at /opt/cluster1.db
# Finally, copy the backup to the student-node. To do this, go back to the student-node and use scp as shown below:
$ scp cluster1-controlplane:/opt/cluster1.db /opt
An ETCD backup for cluster2 is stored at /opt/cluster2.db. Use this snapshot file to carry out a restore on cluster2 to a new path /var/lib/etcd-data-new.
# Step 1. Copy the snapshot file from the student-node to the etcd-server. In the example below, we are copying it to the /root directory:
student-node ~ scp /opt/cluster2.db etcd-server:/root
cluster2.db 100% 1108KB 178.5MB/s 00:00
student-node ~ ➜
# Step 2: Restore the snapshot on cluster2. Since we are restoring directly on the etcd-server, we can use the endpoint https://127.0.0.1:2379. Use the same certificates that were identified earlier. Make sure to use the data-dir as /var/lib/etcd-data-new:
etcd-server ~ ➜ ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/etcd/pki/ca.pem --cert=/etc/etcd/pki/etcd.pem --key=/etc/etcd/pki/etcd-key.pem snapshot restore /root/cluster2.db --data-dir /var/lib/etcd-data-new
{"level":"info","ts":1662004927.2399247,"caller":"snapshot/v3_snapshot.go:296","msg":"restoring snapshot","path":"/root/cluster2.db","wal-dir":"/var/lib/etcd-data-new/member/wal","data-dir":"/var/lib/etcd-data-new","snap-dir":"/var/lib/etcd-data-new/member/snap"}
{"level":"info","ts":1662004927.2584803,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"cdf818194e3a8c32","local-member-id":"0","added-peer-id":"8e9e05c52164694d","added-peer-peer-urls":["http://localhost:2380"]}
{"level":"info","ts":1662004927.264258,"caller":"snapshot/v3_snapshot.go:309","msg":"restored snapshot","path":"/root/cluster2.db","wal-dir":"/var/lib/etcd-data-new/member/wal","data-dir":"/var/lib/etcd-data-new","snap-dir":"/var/lib/etcd-data-new/member/snap"}
etcd-server ~ ➜
# Step 3: Update the systemd service unit file for etcd by running vi /etc/systemd/system/etcd.service and add the new value for data-dir:
[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network.target
[Service]
User=etcd
Type=notify
ExecStart=/usr/local/bin/etcd \
--name etcd-server \
--data-dir=/var/lib/etcd-data-new \
---End of Snippet---
# Step 4: Make sure the permissions on the new directory are correct (it should be owned by the etcd user):
etcd-server /var/lib ➜ chown -R etcd:etcd /var/lib/etcd-data-new
etcd-server /var/lib ➜
etcd-server /var/lib ➜ ls -ld /var/lib/etcd-data-new/
drwx------ 3 etcd etcd 4096 Sep 1 02:41 /var/lib/etcd-data-new/
etcd-server /var/lib ➜
# Step 5: Finally, reload and restart the etcd service.
etcd-server ~/default.etcd ➜ systemctl daemon-reload
etcd-server ~ ➜ systemctl restart etcd
# Step 6 (optional): It is recommended to restart controlplane components (e.g. kube-scheduler, kube-controller-manager, kubelet) to ensure that they don't rely on some stale data.
Security¶
View Cert Details¶
Identify the certificate file used for the kube-api server and Identify the Certificate file used to authenticate kube-apiserver as a client to ETCD Server
controlplane /etc/kubernetes/pki ➜ ls /etc/kubernetes/pki/ | grep .crt
apiserver.crt
apiserver-etcd-client.crt
apiserver-kubelet-client.crt
ca.crt
front-proxy-ca.crt
front-proxy-client.crt
Ans is apiserver.crt for the 1st question and apiserver-etcd-client.crt for the 2nd.
controlplane /etc/kubernetes/pki ➜ ls /etc/kubernetes/pki/ | grep .key
apiserver-etcd-client.key
apiserver.key
apiserver-kubelet-client.key # key used to authenticate kubeapi-server to the kubelet server.
ca.key
front-proxy-ca.key
front-proxy-client.key
sa.key
Identify the ETCD Server Certificate used to host ETCD server
# TIP: Look for cert-file option in the file /etc/kubernetes/manifests/etcd.yaml.
controlplane /etc/kubernetes/manifests ➜ cat etcd.yaml | grep -i .crt
- --cert-file=/etc/kubernetes/pki/etcd/server.crt # answer
- --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
- --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
- --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
Identify the ETCD Server CA Root Certificate used to serve ETCD Server
ETCD can have its own CA. So this may be a different CA certificate than the one used by kube-api server.
# TIP: Look for CA Certificate (trusted-ca-file) in file /etc/kubernetes/manifests/etcd.yaml.
controlplane /etc/kubernetes/manifests ➜ cat etcd.yaml | grep -i .crt
- --cert-file=/etc/kubernetes/pki/etcd/server.crt
- --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
- --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
- --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt # ans
What is the Common Name (CN) configured on the Kube API Server Certificate?
OpenSSL Syntax: openssl x509 -in file-path.crt -text -noout
# TIP: Run the command openssl x509 -in /etc/kubernetes/pki/apiserver.crt -text and look for Subject CN.
controlplane /etc/kubernetes/pki ✖ openssl x509 -in apiserver.crt -text | grep -i cn
Issuer: CN = kubernetes # Name of CA who issued the cert
Subject: CN = kube-apiserver # What is the Common Name (CN) configured on the Kube API Server
MIIDjDCCAnSgAwIBAgIIVPn/5jFVfAMwDQYJKoZIhvcNAQELBQAwFTETMBEGA1UE
nUyXcccNRLfQfrhu9NoD+4Nq7gM99y5QRpD8QimBnv1DBzXk+XWoC2Ka3EpmRzZZ
Which of the below alternate names is not configured on the Kube API Server Certificate?
#TIP: Run the command openssl x509 -in /etc/kubernetes/pki/apiserver.crt -text and look at Alternative Names as shown below
X509v3 Subject Alternative Name:
DNS:controlplane, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, IP Address:10.96.0.1, IP Address:192.2.128.9
How long, from the issued date, is the Kube-API Server Certificate valid for?
#TIP: Run the command openssl x509 -in /etc/kubernetes/pki/apiserver.crt -text and check on the Expiry date.
Issuer: CN = etcd-ca
Validity
Not Before: Nov 20 00:50:29 2023 GMT
Not After : Nov 19 00:50:29 2024 GMT
How long, from the issued date, is the Root CA Certificate valid for?
#TIP: Run the command openssl x509 -in /etc/kubernetes/pki/ca.crt -text and look for the validity.
Data:
Version: 3 (0x2)
Serial Number: 0 (0x0)
Signature Algorithm: sha256WithRSAEncryption
Issuer: CN = kubernetes
Validity
Not Before: Nov 20 00:50:28 2023 GMT
Not After : Nov 17 00:50:28 2033 GMT
Subject: CN = kubernetes
The kube-api server stopped again! Check it out. Inspect the kube-api server logs and identify the root cause and fix the issue
Run crictl ps -a command to identify the kube-api server container. Run crictl logs container-id command to view the logs.
crictl ps -a
CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD
3bac921cb4a5a 6f707f569b572 16 seconds ago Running kube-apiserver 0 fec26f7c6715a kube-apiserver-controlplane
33527b14bc1bf f73f1b39c3fe8 22 seconds ago Running kube-scheduler 2 fb3dc26664afc kube-scheduler-controlplane
627a986414afe 95fe52ed44570 25 seconds ago Running kube-controller-manager 2 c674e558cf141 kube-controller-manager-controlplane
5d7b588da6e90 86b6af7dd652c About a minute ago Running etcd 0 efd0a8d9c600e etcd-controlplane
762ccbdfde7a8 f73f1b39c3fe8 4 minutes ago Exited kube-scheduler 1 fb3dc26664afc kube-scheduler-controlplane
b3e4272eefb7b 95fe52ed44570 4 minutes ago Exited kube-controller-manager 1 c674e558cf141 kube-controller-manager-controlplane
7fa9a78979c3e ead0a4a53df89 35 minutes ago Running coredns 0 c23b177006c7a coredns-5d78c9869d-p8tjq
bf84f7ebb5d43 ead0a4a53df89 35 minutes ago Running coredns 0 274222b10e501 coredns-5d78c9869d-84jrn
247e828f3e5e3 8b675dda11bb1 35 minutes ago Running kube-flannel 0 579d556b90be1 kube-flannel-ds-6xdvn
b412cc3976452 8b675dda11bb1 35 minutes ago Exited install-cni 0 579d556b90be1 kube-flannel-ds-6xdvn
63734700d5255 fcecffc7ad4af 35 minutes ago Exited install-cni-plugin 0 579d556b90be1 kube-flannel-ds-6xdvn
7c4e662a2827d 5f82fc39fa816 35 minutes ago Running kube-proxy 0 60aabd71d5180 kube-proxy-hs4c4
crictl logs da40e86464c04
I1120 01:29:05.406801 1 server.go:551] external host was not specified, using 192.2.128.9
I1120 01:29:05.407768 1 server.go:165] Version: v1.27.0
I1120 01:29:05.407793 1 server.go:167] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
I1120 01:29:05.685211 1 shared_informer.go:311] Waiting for caches to sync for node_authorizer
I1120 01:29:05.694542 1 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
I1120 01:29:05.694560 1 plugins.go:161] Loaded 13 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,PodSecurity,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,ClusterTrustBundleAttest,CertificateSubjectRestriction,ValidatingAdmissionPolicy,ValidatingAdmissionWebhook,ResourceQuota.
W1120 01:29:05.700731 1 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {
"Addr": "127.0.0.1:2379",
"ServerName": "127.0.0.1",
"Attributes": null,
"BalancerAttributes": null,
"Type": 0,
"Metadata": null
}. Err: connection error: desc = "transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate signed by unknown authority"
W1120 01:29:06.688471 1 logging.go:59] [core] [Channel #4 SubChannel #6] grpc: addrConn.createTransport failed to connect to {
"Addr": "127.0.0.1:2379",
"ServerName": "127.0.0.1",
"Attributes": null,
"BalancerAttributes": null,
"Type": 0,
"Metadata": null
The apiserver cannot verify etcd's certificate, so point it at the correct etcd CA certificate: fix the --etcd-cafile flag in /etc/kubernetes/manifests/kube-apiserver.yaml (normally /etc/kubernetes/pki/etcd/ca.crt).
A new member akshay joined our team. He requires access to our cluster
The Certificate Signing Request is at the /root location.
Use this command to generate the base64 encoded format as following: -
cat akshay.csr | base64 -w 0
Finally, save the below YAML in a file and create a CSR named akshay as follows:
---
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: akshay
spec:
  groups:
  - system:authenticated
  request: <Paste the base64 encoded value of the CSR file>
  signerName: kubernetes.io/kube-apiserver-client
  usages:
  - client auth
kubectl apply -f akshay-csr.yaml
Sample shown below
controlplane ~ ➜ cat csr-mani.yaml
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
name: akshay
spec:
request: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURSBSRVFVRVNULS0tLS0KTUlJQ1ZqQ0NBVDRDQVFBd0VURVBNQTBHQTFVRUF3d0dZV3R6YUdGNU1JSUJJakFOQmdrcWhraUc5dzBCQVFFRgpBQU9DQVE4QU1JSUJDZ0tDQVFFQTc2WjA5ZlgzLzhaanB1Rlc2aE9xeG1tYW1qb3VRVDZvSDJ0WHltd0xVOGx5Cmg4dTQwV1RtTVRxbzk4Kzk0a3lnOTdKUFRWbDdsWkNRbkZKdmlpTlAzVlRRa0tOU3FOakQzcGRESUxsUXErcHQKeDV2bXhhcUxmTlZocEt5QzdkZlk1L1VEZHNPT05CYit4dWNkNmx4YU5kdTJqMml4alF2aisyOXdRdExvaUYxNQpQNDZ5NkQ0c1dnb04zcWc4Y1RhNTRNcnRPc1FBem1CZHdQcnVXNXFlODBNaGMrQk9HWmx2YlZPcmIzREVINmFOCmNTMzA2SGlwUzl5TkpOMzArdThwd1FtcS9QM0JneHJuOS9DNkhPY1JiaHQ0WTE2Q2hjZUk3anFjcVRHbithcE4KemgxeDN2ZGg3dVNCam1Pb3JsNVpTYW1FcHhOdnBpVkdqNUZMaVBYYW93SURBUUFCb0FBd0RRWUpLb1pJaHZjTgpBUUVMQlFBRGdnRUJBRFpINmFtcXdEMjZDM1dwVjlKNzM1N3hibGhIQVkrK3FOSzR6Qk1jZE01dnUwV1VYK3dGCkFRd2czREFORW52UThMdWJnV3RLaEkwUGxPbjRWK1JSZzIxK01qUFhsUzNDWkZodEN6VE9oY0hwUGVBQnZZQnEKWkthTHBTTVlTdEVqYnNsWGg1dVhiZmxMRHBRSllZNEdTc2tXRStsZnVzTUNyNFhGNzNSVUNFWHdHZGFFNFdpcAp6WGFsb0x1ZFdneGFmVlRSR1JWK2RKMXNuV2pMaWRySVU3NDZxUVZiUW1Gc0pWU2VaTjZNSGRiU0xIZFZNZjJMCmR3dThNcUVpRUQzeUhJT2dCdlowaWJQT1VtMmFrZVFUT2F5cEhCQVJJanRSYVA5cnhuYUM4ckNZK0czb0MvdEQKY1hpajk3ak5pRzU0UWJMMEN5NUdqcEQ2YndTek5iM0dpN3c9Ci0tLS0tRU5EIENFUlRJRklDQVRFIFJFUVVFU1QtLS0tLQo=
signerName: kubernetes.io/kube-apiserver-client
expirationSeconds: 86400 # one day
usages:
- client auth
The CSR is created as shown below:
controlplane ~ ➜ k apply -f csr-mani.yaml
certificatesigningrequest.certificates.k8s.io/akshay created
Check the CSRs:
controlplane ~ ➜ k get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
akshay 73s kubernetes.io/kube-apiserver-client kubernetes-admin 24h Pending
csr-8wnnl 17m kubernetes.io/kube-apiserver-client-kubelet system:node:controlplane <none> Approved,Issued
Please approve the CSR
controlplane ~ ✖ k certificate approve akshay
certificatesigningrequest.certificates.k8s.io/akshay approved
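Once approved, the issued certificate can be extracted from the CSR object if the user needs it (sketch):
k get csr akshay -o jsonpath='{.status.certificate}' | base64 -d > akshay.crt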
Describe the new CSR you got
# TIP: use the get -o yaml instead of describe
controlplane ~ ➜ k get csr agent-smith -o yaml
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  creationTimestamp: "2023-11-20T02:04:44Z"
  name: agent-smith
  resourceVersion: "2170"
  uid: 8eeaf351-96ac-488a-8d83-cab1edc14605
spec:
  groups:
  - system:masters
  - system:authenticated
  request: XXXX_OMITTED
  signerName: kubernetes.io/kube-apiserver-client
  usages:
  - digital signature
  - key encipherment
  - server auth
  username: agent-x
status: {}
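This request asks for the system:masters group, which should not normally be granted; if the task is to reject it, use:
k certificate deny agent-smith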
Kubeconfig¶
Where is the default kubeconfig file located in the current environment?
/root/.kube/config
How many clusters are defined in the default kubeconfig file?
# Run the kubectl config view command and count the number of clusters.
controlplane ~/.kube ➜ k config view
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://controlplane:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubernetes-admin
  name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users: # number of users
- name: kubernetes-admin
  user:
    client-certificate-data: DATA+OMITTED
    client-key-data: DATA+OMITTED
Let's check the custom kubeconfig file:
~/.kube ➜ cat /root/my-kube-config
apiVersion: v1
kind: Config
clusters: # 4 clusters are configured
- name: production
  cluster:
    certificate-authority: /etc/kubernetes/pki/ca.crt
    server: https://controlplane:6443
- name: development
  cluster:
    certificate-authority: /etc/kubernetes/pki/ca.crt
    server: https://controlplane:6443
- name: kubernetes-on-aws
  cluster:
    certificate-authority: /etc/kubernetes/pki/ca.crt
    server: https://controlplane:6443
- name: test-cluster-1
  cluster:
    certificate-authority: /etc/kubernetes/pki/ca.crt
    server: https://controlplane:6443
contexts:
- name: test-user@development
  context:
    cluster: development
    user: test-user
- name: aws-user@kubernetes-on-aws
  context:
    cluster: kubernetes-on-aws
    user: aws-user
- name: test-user@production
  context:
    cluster: production
    user: test-user
- name: research
  context:
    cluster: test-cluster-1
    user: dev-user # user for research context
users:
- name: test-user
  user:
    client-certificate: /etc/kubernetes/pki/users/test-user/test-user.crt
    client-key: /etc/kubernetes/pki/users/test-user/test-user.key
- name: dev-user
  user:
    client-certificate: /etc/kubernetes/pki/users/dev-user/developer-user.crt
    client-key: /etc/kubernetes/pki/users/dev-user/dev-user.key
- name: aws-user
  user:
    client-certificate: /etc/kubernetes/pki/users/aws-user/aws-user.crt
    client-key: /etc/kubernetes/pki/users/aws-user/aws-user.key
current-context: test-user@development # current context
preferences: {}
I would like to use the dev-user to access test-cluster-1. Set the current context to the right one so I can do that
# TIP: use the right context file as well
controlplane ~ ➜ k config --kubeconfig /root/my-kube-config use-context research
Switched to context "research".
# Test it
controlplane ~ ➜ k config --kubeconfig /root/my-kube-config current-context
research
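To avoid passing --kubeconfig on every command, the custom file can be made the default; either option below works (the second overwrites the existing default config):
export KUBECONFIG=/root/my-kube-config
# or
cp /root/my-kube-config /root/.kube/config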
RBAC¶
Role-based access control (RBAC) is a method of regulating access to computer or network resources based on the roles of individual users within your organization.
The RBAC API declares four kinds of Kubernetes object: Role, ClusterRole, RoleBinding and ClusterRoleBinding.
An RBAC Role or ClusterRole contains rules that represent a set of permissions. Permissions are purely additive (there are no "deny" rules). A Role always sets permissions within a particular namespace; when you create a Role, you have to specify the namespace it belongs in. ClusterRole, by contrast, is a non-namespaced resource. The resources have different names (Role and ClusterRole) because a Kubernetes object always has to be either namespaced or not namespaced; it can't be both.
# Read role
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
ClusterRole example shown below
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  # "namespace" omitted since ClusterRoles are not namespaced
  name: secret-reader
rules:
- apiGroups: [""]
  # at the HTTP level, the name of the resource for accessing Secret
  # objects is "secrets"
  resources: ["secrets"]
  verbs: ["get", "watch", "list"]
A role binding grants the permissions defined in a role to a user or set of users. It holds a list of subjects (users, groups, or service accounts), and a reference to the role being granted. A RoleBinding grants permissions within a specific namespace whereas a ClusterRoleBinding grants that access cluster-wide.
apiVersion: rbac.authorization.k8s.io/v1
# This role binding allows "jane" to read pods in the "default" namespace.
# You need to already have a Role named "pod-reader" in that namespace.
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
# You can specify more than one "subject"
- kind: User
  name: jane # "name" is case sensitive
  apiGroup: rbac.authorization.k8s.io
roleRef:
  # "roleRef" specifies the binding to a Role / ClusterRole
  kind: Role # this must be Role or ClusterRole
  name: pod-reader # this must match the name of the Role or ClusterRole you wish to bind to
  apiGroup: rbac.authorization.k8s.io
Using ClusterRole in RoleBinding
A RoleBinding can also reference a ClusterRole to grant the permissions defined in that ClusterRole to resources inside the RoleBinding's namespace. This kind of reference lets you define a set of common roles across your cluster, then reuse them within multiple namespaces.
To grant permissions across a whole cluster, you can use a ClusterRoleBinding. The following ClusterRoleBinding allows any user in the group "manager" to read secrets in any namespace.
apiVersion: rbac.authorization.k8s.io/v1
# This cluster role binding allows anyone in the "manager" group to read secrets in any namespace.
kind: ClusterRoleBinding
metadata:
  name: read-secrets-global
subjects:
- kind: Group
  name: manager # Name is case sensitive
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: secret-reader
  apiGroup: rbac.authorization.k8s.io
Question
Inspect the environment and identify the authorization modes configured on the cluster.
# Use the command kubectl describe pod kube-apiserver-controlplane -n kube-system and look for --authorization-mode.
controlplane ~ ➜ k -n kube-system describe po kube-apiserver-controlplane | grep -i auth
--authorization-mode=Node,RBAC
--enable-bootstrap-token-auth=true
What are the resources the kube-proxy role in the kube-system namespace is given access to?
controlplane ~ ➜ k describe role kube-proxy -n kube-system
Name: kube-proxy
Labels: <none>
Annotations: <none>
PolicyRule:
Resources Non-Resource URLs Resource Names Verbs
--------- ----------------- -------------- -----
configmaps [] [kube-proxy] [get]
Which account is the kube-proxy role assigned to?
controlplane ~ ➜ kubectl describe rolebinding kube-proxy -n kube-system
Name: kube-proxy
Labels: <none>
Annotations: <none>
Role:
Kind: Role
Name: kube-proxy
Subjects:
Kind Name Namespace
---- ---- ---------
Group system:bootstrappers:kubeadm:default-node-token
A user dev-user is created. User's details have been added to the kubeconfig file. Inspect the permissions granted to the user. Check if the user can list pods in the default namespace.
k get pods --as dev-user
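A quicker check that prints yes/no instead of an error:
k auth can-i list pods --as dev-user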
Create the necessary roles and role bindings required for the dev-user to create, list and delete pods in the default namespace
controlplane ~ ✖ kubectl create role developer --verb=create --verb=list --verb=delete --resource=pods
role.rbac.authorization.k8s.io/developer created
Create a binding for it as shown below
controlplane ~ ✖ kubectl create rolebinding dev-user-binding --role=developer --user=dev-user --namespace=default
rolebinding.rbac.authorization.k8s.io/dev-user-binding created
controlplane ~ ➜ k get pods --as dev-user
NAME READY STATUS RESTARTS AGE
red-697496b845-2srbh 1/1 Running 0 18m
red-697496b845-n4zsd 1/1 Running 0 18m
What user/groups are the cluster-admin role bound to?
The ClusterRoleBinding for this role has the same name.
controlplane ~ ➜ k get clusterrolebindings.rbac.authorization.k8s.io cluster-admin -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: "2023-11-21T01:48:21Z"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: cluster-admin
  resourceVersion: "134"
  uid: 4330de0f-ef56-42ef-8ea9-c2bb3e26f4a2
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:masters # answer
What level of permission does the cluster-admin role grant?
controlplane ~ ➜ k get clusterrole cluster-admin -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: "2023-11-21T01:48:20Z"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: cluster-admin
  resourceVersion: "72"
  uid: c5bab975-ad5f-48bb-837c-65aae7200b9e
rules:
- apiGroups:
  - '*'
  resources:
  - '*'
  verbs:
  - '*'
- nonResourceURLs:
  - '*'
  verbs:
  - '*'
Answer: perform any action on any resource in the cluster.
A new user michelle joined the team. She will be focusing on the nodes in the cluster. Create the required ClusterRoles and ClusterRoleBindings so she gets access to the nodes
controlplane ~ ➜ kubectl create clusterrole node-reader --verb=get,list,watch --resource=nodes
clusterrole.rbac.authorization.k8s.io/node-reader created
controlplane ~ ✖ kubectl create clusterrolebinding node-reader-cluster-binding --clusterrole=node-reader --user=michelle
clusterrolebinding.rbac.authorization.k8s.io/node-reader-cluster-binding created
Michelle's responsibilities are growing and now she will be responsible for storage as well. Create the required ClusterRoles and ClusterRoleBindings to allow her access to Storage
Get the API groups and resource names from command kubectl api-resources. Use the given spec:
ClusterRole: storage-admin
Resource: persistentvolumes
Resource: storageclasses
ClusterRoleBinding: michelle-storage-admin
ClusterRoleBinding Subject: michelle
ClusterRoleBinding Role: storage-admin
Answer is shown below
controlplane ~ ➜ kubectl create clusterrole storage-admin --verb=get,list,watch --resource=persistentvolumes --verb=get,list --resource=storageclasses
clusterrole.rbac.authorization.k8s.io/storage-admin created
controlplane ~ ➜ kubectl create clusterrolebinding michelle-storage-admin --clusterrole=storage-admin --user=michelle
clusterrolebinding.rbac.authorization.k8s.io/michelle-storage-admin created
Service Accounts¶
- A service account is a type of non-human account that, in Kubernetes, provides a distinct identity in a Kubernetes cluster.
- Application Pods, system components, and entities inside and outside the cluster can use a specific ServiceAccount's credentials to identify as that ServiceAccount.
- Each service account is bound to a Kubernetes namespace. Every namespace gets a default ServiceAccount upon creation.
Which service account is used by the deployment?
controlplane ~ ➜ k get po web-dashboard-97c9c59f6-x2zdd -o yaml | grep -i service
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
enableServiceLinks: true
serviceAccount: default
serviceAccountName: default
- serviceAccountToken:
The application needs a ServiceAccount with the Right permissions to be created to authenticate to Kubernetes. The default ServiceAccount has limited access. Create a new ServiceAccount named dashboard-sa.
controlplane ~ ➜ k create sa dashboard-sa
serviceaccount/dashboard-sa created
Create a new token for the ServiceAccount:
controlplane ~ ➜ k create token dashboard-sa
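To make the web-dashboard deployment use the new ServiceAccount (the deployment name is assumed from the pod name above), either set spec.template.spec.serviceAccountName: dashboard-sa in the manifest or run:
k set serviceaccount deploy web-dashboard dashboard-sa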
Image Security¶
What secret type must we choose for docker registry?
root@controlplane ~ ➜ k create secret --help
Create a secret using specified subcommand.
Available Commands:
docker-registry Create a secret for use with a Docker registry # answer
generic Create a secret from a local file, directory, or literal value
tls Create a TLS secret
Usage:
kubectl create secret [flags] [options]
We decided to use a modified version of the application from an internal private registry. Update the image of the deployment to use a new image from myprivateregistry.com:5000
# update the image as shown below
spec:
  containers:
  - image: myprivateregistry.com:5000/nginx:alpine
    imagePullPolicy: IfNotPresent
Create a secret object with the credentials required to access the registry.
Name: private-reg-cred
Username: dock_user
Password: dock_password
Server: myprivateregistry.com:5000
Email: dock_user@myprivateregistry.com
Create using below
root@controlplane ~ ➜ kubectl create secret docker-registry private-reg-cred --docker-email=dock_user@myprivateregistry.com --docker-username=dock_user --docker-password=dock_password --docker-server=myprivateregistry.com:5000
secret/private-reg-cred created
Configure the deployment to use credentials from the new secret to pull images from the private registry
# Add the imagepull secret as shown below
spec:
  containers:
  - image: myprivateregistry.com:5000/nginx:alpine
    imagePullPolicy: IfNotPresent
    name: nginx
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
  dnsPolicy: ClusterFirst
  imagePullSecrets:
  - name: private-reg-cred
Security Contexts¶
A security context defines privilege and access control settings for a Pod or Container. Security context settings include, but are not limited to:
Discretionary Access Control: Permission to access an object, like a file, is based on user ID (UID) and group ID (GID).
Security Enhanced Linux (SELinux): Objects are assigned security labels.
Running as privileged or unprivileged.
Linux Capabilities: Give a process some privileges, but not all the privileges of the root user.
# Sample Security Context
apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo
spec:
  securityContext: # pod-level: applies to all containers in the pod
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  volumes:
  - name: sec-ctx-vol
    emptyDir: {}
  containers:
  - name: sec-ctx-demo
    image: busybox:1.28
    command: [ "sh", "-c", "sleep 1h" ]
    volumeMounts:
    - name: sec-ctx-vol
      mountPath: /data/demo
    securityContext: # container-level: overrides pod-level settings
      allowPrivilegeEscalation: false
What is the user used to execute the sleep process within the ubuntu-sleeper pod?
# Check the user by checking the security context or doing the exec -it
controlplane ~ ➜ k get po ubuntu-sleeper -o yaml | grep -i securi
securityContext: {}
controlplane ~ ➜ k exec -it ubuntu-sleeper -- /bin/bash
root@ubuntu-sleeper:/# whoami
root
Edit the pod ubuntu-sleeper to run the sleep process with user ID 1010
spec:
  containers:
  - command:
    - sleep
    - "4800"
    image: ubuntu
    imagePullPolicy: Always
    name: ubuntu
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-grl6f
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: controlplane
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    runAsUser: 1010 ## add this
A Pod definition file named multi-pod.yaml is given. With what user are the processes in the web container started?
controlplane ~ ➜ cat multi-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-pod
spec:
  securityContext:
    runAsUser: 1001
  containers:
  - image: ubuntu
    name: web
    command: ["sleep", "5000"]
    securityContext:
      runAsUser: 1002 # ans is 1002: the container-level setting overrides the pod-level one
  - image: ubuntu
    name: sidecar
    command: ["sleep", "5000"]
Update pod ubuntu-sleeper to run as Root user and with the SYS_TIME capability
controlplane ~ ➜ cat multi-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-sleeper
spec:
  containers:
  - image: ubuntu
    name: web
    command: ["sleep", "5000"]
    securityContext: # added to the container, not the pod
      capabilities:
        add: ["SYS_TIME"]
Network Policies¶
If you want to control traffic flow at the IP address or port level for TCP, UDP, and SCTP protocols, then you might consider using Kubernetes NetworkPolicies for particular applications in your cluster.
NetworkPolicies are an application-centric construct which allow you to specify how a pod is allowed to communicate with various network "entities" (we use the word "entity" here to avoid overloading the more common terms such as "endpoints" and "services", which have specific Kubernetes connotations) over the network
- By default, a pod is non-isolated for egress; all outbound connections are allowed.
- By default, a pod is non-isolated for ingress; all inbound connections are allowed.
Network Plugin
Network policies are implemented by the network plugin. To use network policies, you must be using a networking solution which supports NetworkPolicy. Creating a NetworkPolicy resource without a controller that implements it will have no effect.
# Network Policy Example
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: test-network-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      role: db
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - ipBlock:
        cidr: 172.17.0.0/16
        except:
        - 172.17.1.0/24
    - namespaceSelector:
        matchLabels:
          project: myproject
    - podSelector:
        matchLabels:
          role: frontend
    ports:
    - protocol: TCP
      port: 6379
  egress:
  - to:
    - ipBlock:
        cidr: 10.0.0.0/24
    ports:
    - protocol: TCP
      port: 5978
Default deny egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
spec:
  podSelector: {}
  policyTypes:
  - Egress
Default allow egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-egress
spec:
  podSelector: {}
  egress:
  - {} # egress is defined here
  policyTypes:
  - Egress
Default deny all ingress and all egress traffic
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
What is the meaning of this policy?
controlplane ~ ➜ k describe networkpolicies.networking.k8s.io payroll-policy
Name:         payroll-policy
Namespace:    default
Created on:   2023-11-20 22:47:02 -0500 EST
Labels:       <none>
Annotations:  <none>
Spec:
  PodSelector:     name=payroll
  Allowing ingress traffic:
    To Port: 8080/TCP
    From:
      PodSelector: name=internal # traffic from the internal pod to the payroll pod is allowed
  Not affecting egress traffic
  Policy Types: Ingress
Use the spec given below. You might want to enable ingress traffic to the pod to test your rules in the UI
Policy Name: internal-policy
Policy Type: Egress
Egress Allow: payroll
Payroll Port: 8080
Egress Allow: mysql
MySQL Port: 3306
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: internal-policy
  namespace: default
spec:
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - {}
  egress:
  - to:
    - podSelector:
        matchLabels:
          name: payroll
    ports:
    - protocol: TCP
      port: 8080
  - to:
    - podSelector:
        matchLabels:
          name: mysql
    ports:
    - protocol: TCP
      port: 3306
Storage¶
PV and PVC¶
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual Pod that uses the PV.
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany
Reclaim Policy
When a user is done with their volume, they can delete the PVC objects from the API that allows reclamation of the resource. The reclaim policy for a PersistentVolume tells the cluster what to do with the volume after it has been released of its claim. Currently, volumes can either be Retained, Recycled, or Deleted.
- Retain reclaim policy allows for manual reclamation of the resource. When the PersistentVolumeClaim is deleted, the PersistentVolume still exists and the volume is considered "released". But it is not yet available for another claim because the previous claimant's data remains on the volume.
- For volume plugins that support the Delete reclaim policy, deletion removes both the PersistentVolume object from Kubernetes, as well as the associated storage asset in the external infrastructure, such as an AWS EBS or GCE PD volume.
- If supported by the underlying volume plugin, the Recycle reclaim policy performs a basic scrub (rm -rf /thevolume/*) on the volume and makes it available again for a new claim.
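The reclaim policy is visible in the RECLAIM POLICY column of kubectl get pv, and it can be changed on an existing volume with kubectl patch. A quick sketch (the PV name is a placeholder):
# check the current reclaim policy
kubectl get pv
# switch a volume's reclaim policy to Retain
kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'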
Configure a volume to store these logs at /var/log/webapp on the host, using:
Name: webapp
Image Name: kodekloud/event-simulator
Volume HostPath: /var/log/webapp
Volume Mount: /log
controlplane ~ ✖ cat po.yaml
apiVersion: v1
kind: Pod
metadata:
name: webapp
namespace: default
spec:
containers:
- image: kodekloud/event-simulator
imagePullPolicy: Always
name: event-simulator
resources: {}
volumeMounts:
- mountPath: /log
name: log-vol
readOnly: true
volumes:
- name: log-vol
hostPath:
# directory location on host
path: /var/log/webapp
# this field is optional
type: Directory
Create a Persistent Volume with the given specification
Volume Name: pv-log
Storage: 100Mi
Access Modes: ReadWriteMany
Host Path: /pv/log
Reclaim Policy: Retain
controlplane ~ ➜ cat pv1.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv-log
spec:
capacity:
storage: 100Mi
hostPath:
path: /pv/log
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
Let us claim some of that storage for our application. Create a Persistent Volume Claim with the given specification
Volume Name: claim-log-1
Storage Request: 50Mi
Access Modes: ReadWriteOnce
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: claim-log-1
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 50Mi
Check whether the claim is bound:
controlplane ~ ✖ k get pv,pvc
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pv-log 100Mi RWX Retain Bound default/claim-log-1 5m56s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/claim-log-1 Bound pv-log 100Mi RWX 13s
Update the webapp pod to use the persistent volume claim as its storage
Replace hostPath configured earlier with the newly created PersistentVolumeClaim.
Name: webapp
Image Name: kodekloud/event-simulator
Volume: PersistentVolumeClaim=claim-log-1
Volume Mount: /log
apiVersion: v1
kind: Pod
metadata:
name: webapp
namespace: default
spec:
containers:
- image: kodekloud/event-simulator
imagePullPolicy: Always
name: event-simulator
resources: {}
volumeMounts:
- mountPath: /log
name: pv-claim
volumes:
- name: pv-claim
persistentVolumeClaim:
claimName: claim-log-1
Storage Class¶
A StorageClass provides a way for administrators to describe the "classes" of storage they offer. Different classes might map to quality-of-service levels, or to backup policies, or to arbitrary policies determined by the cluster administrators.
When a PVC does not specify a storageClassName, the default StorageClass is used. The cluster can only have one default StorageClass.
# SC example
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp2
reclaimPolicy: Retain
allowVolumeExpansion: true
mountOptions:
- debug
volumeBindingMode: Immediate
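To see which StorageClass is the default, list the classes; the default one is flagged "(default)". A default can be set or unset by toggling the storageclass.kubernetes.io/is-default-class annotation. A sketch (the class name is a placeholder):
kubectl get storageclass
# mark a class as the default
kubectl patch storageclass <sc-name> -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'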
Create a new PersistentVolumeClaim by the name of local-pvc that should bind to the volume local-pv
Inspect the pv local-pv for the specs.
PVC: local-pvc
Correct Access Mode?
Correct StorageClass Used?
PVC requests volume size = 500Mi?
controlplane ~ ➜ cat pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: local-pvc
spec:
accessModes:
- ReadWriteOnce
volumeMode: Filesystem
resources:
requests:
storage: 500Mi
storageClassName: local-storage
Why is the PVC in a pending state despite making a valid request to claim the volume called local-pv?
# The StorageClass used by the PVC uses WaitForFirstConsumer volume binding mode. This means that the persistent volume will not bind to the claim until a pod makes use of the PVC to request storage.
controlplane ~ ✖ k describe pvc local-pvc | grep -A4 Events
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitForFirstConsumer 11s (x16 over 3m47s) persistentvolume-controller waiting for first consumer to be created before binding
Create a new pod called nginx with the image nginx:alpine. The Pod should make use of the PVC local-pvc and mount the volume at the path /var/www/html
The PV local-pv should be in a bound state.
Pod created with the correct Image?
Pod uses PVC called local-pvc?
local-pv bound?
nginx pod running?
Volume mounted at the correct path?
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: nginx
name: nginx
spec:
containers:
- image: nginx:alpine
name: nginx
volumeMounts:
- mountPath: "/var/www/html"
name: mypd
volumes:
- name: mypd
persistentVolumeClaim:
claimName: local-pvc
dnsPolicy: ClusterFirst
restartPolicy: Always
status: {}
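Once this pod is scheduled, the WaitForFirstConsumer binding completes. A quick check, which should now show both local-pv and local-pvc as Bound:
kubectl get pv,pvc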
Create a new Storage Class called delayed-volume-sc that makes use of the below specs:
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
controlplane ~ ➜ cat sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: delayed-volume-sc
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
JQ¶
k get nodes -o json | jq -c 'paths'
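This prints every JSON path in the output, which is handy when building a JSONPath query. For example, filtering the paths for address fields (an illustrative use, not from the lab itself):
k get nodes -o json | jq -c 'paths' | grep -i address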
Networking¶
What is the network interface configured for cluster connectivity on the controlplane node?
controlplane ~ ✖ k get no -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
controlplane Ready control-plane 9m2s v1.27.0 192.9.180.3 <none> Ubuntu 20.04.5 LTS 5.4.0-1106-gcp containerd://1.6.6 # get the IP for controlPlane
node01 Ready <none> 8m37s v1.27.0 192.9.180.6 <none> Ubuntu 20.04.5 LTS 5.4.0-1106-gcp containerd://1.6.6
controlplane ~ ➜ ip a | grep -B3 192.9.180.3 # get the interface from the IP
366: eth0@if367: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
link/ether 02:42:c0:09:b4:03 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 192.9.180.3/24 brd 192.9.180.255 scope global eth0
Get the IP of the default gateway:
controlplane ~ ➜ ip route show default
default via 172.25.0.1 dev eth1
What is the port the kube-scheduler is listening on in the controlplane node?
controlplane ~ ➜ netstat -lntp | grep scheduler
tcp 0 0 127.0.0.1:10259 0.0.0.0:* LISTEN 3586/kube-scheduler
Notice that ETCD is listening on two ports. Which of these have more client connections established?
controlplane ~ ➜ netstat -lntp | grep etcd
tcp 0 0 192.9.180.3:2379 0.0.0.0:* LISTEN 3600/etcd
tcp 0 0 127.0.0.1:2379 0.0.0.0:* LISTEN 3600/etcd
tcp 0 0 192.9.180.3:2380 0.0.0.0:* LISTEN 3600/etcd
tcp 0 0 127.0.0.1:2381 0.0.0.0:* LISTEN 3600/etcd
CNI¶
Networking is a central part of Kubernetes, but it can be challenging to understand exactly how it is expected to work. There are 4 distinct networking problems to address:
Highly-coupled container-to-container communications: this is solved by Pods and localhost communications.
Pod-to-Pod communications: this is the primary focus of this document.
Pod-to-Service communications: this is covered by Services.
External-to-Service communications: this is also covered by Services.
CNI (Container Network Interface), a Cloud Native Computing Foundation project, consists of a specification and libraries for writing plugins to configure network interfaces in Linux containers, along with a number of supported plugins. CNI concerns itself only with network connectivity of containers and removing allocated resources when the container is deleted. Because of this focus, CNI has a wide range of support and the specification is simple to implement.
Inspect the kubelet service and identify the container runtime endpoint value set for Kubernetes.
controlplane ~ ➜ ps aux | grep kubelet | grep endpoint
root 4567 0.0 0.0 3848468 99736 ? Ssl 01:06 0:06 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock --pod-infra-container-image=registry.k8s.io/pause:3.9
Identify which of the below plugins is not available in the list of available CNI plugins on this host.
# check the below path
controlplane ~ ➜ ls /opt/cni/bin/
bandwidth dhcp firewall host-device ipvlan macvlan ptp static vlan
bridge dummy flannel host-local loopback portmap sbr tuning vrf
What is the CNI plugin configured to be used on this kubernetes cluster?
Run the command `ls /etc/cni/net.d/` and identify the name of the plugin.
Kubeadm¶
kubeadm upgrade plan [version]
Lightning Lab¶
Upgrade the current version of kubernetes from `1.26.0` to `1.27.0` exactly using the kubeadm utility.
There is currently an issue with this lab which requires an extra step. This may be addressed in the near future. On the controlplane node:
# 1. Drain the node
kubectl drain controlplane --ignore-daemonsets
apt-get update
apt-mark unhold kubeadm
apt-get install -y kubeadm=1.27.0-00
kubeadm upgrade plan
kubeadm upgrade apply v1.27.0
kubectl describe node controlplane | grep -A 3 taint
Taints: node-role.kubernetes.io/control-plane:NoSchedule
node.kubernetes.io/unschedulable:NoSchedule
kubectl taint node controlplane node-role.kubernetes.io/control-plane:NoSchedule-
kubectl taint node controlplane node.kubernetes.io/unschedulable:NoSchedule-
apt-mark unhold kubelet
apt-get install -y kubelet=1.27.0-00
systemctl daemon-reload
systemctl restart kubelet
kubectl uncordon controlplane
apt-mark unhold kubectl
apt-get install -y kubectl=1.27.0-00
apt-mark hold kubeadm kubelet kubectl
kubectl drain node01 --ignore-daemonsets
ssh node01
apt-get update
apt-mark unhold kubeadm
apt-get install -y kubeadm=1.27.0-00
kubeadm upgrade node
apt-mark unhold kubelet
apt-get install kubelet=1.27.0-00
systemctl daemon-reload
systemctl restart kubelet
apt-mark hold kubeadm kubelet
exit
kubectl uncordon node01
kubectl get pods -o wide | grep gold-nginx
Print the names of all deployments in the admin2406 namespace in the following format
This is a job for the custom-columns output of kubectl:
kubectl -n admin2406 get deployment -o custom-columns=DEPLOYMENT:.metadata.name,CONTAINER_IMAGE:.spec.template.spec.containers[].image,READY_REPLICAS:.status.readyReplicas,NAMESPACE:.metadata.namespace --sort-by=.metadata.name > /opt/admin2406_data
A kubeconfig file called admin.kubeconfig has been created in /root/CKA. There is something wrong with the configuration. Troubleshoot and fix it.
First, let's test this kubeconfig
kubectl get pods --kubeconfig /root/CKA/admin.kubeconfig
Notice the error message.
Now look at the default kubeconfig for the correct setting.
cat ~/.kube/config
Make the correction
vi /root/CKA/admin.kubeconfig
Test
kubectl get pods --kubeconfig /root/CKA/admin.kubeconfig
Create a new deployment called nginx-deploy, with image nginx:1.16 and 1 replica. Next upgrade the deployment to version 1.17 using rolling update.
kubectl create deployment nginx-deploy --image=nginx:1.16
kubectl set image deployment/nginx-deploy nginx=nginx:1.17 --record
You may ignore the deprecation warning.
A new deployment called alpha-mysql
has been deployed in the alpha namespace. However, the pods are not running. Troubleshoot and fix the issue
The deployment should make use of the persistent volume alpha-pv to be mounted at /var/lib/mysql and should use the environment variable MYSQL_ALLOW_EMPTY_PASSWORD=1 to make use of an empty root password.
Important: Do not alter the persistent volume.
Inspect the deployment to check that the environment variable is set. Here I'm using yq, which is like jq but for YAML, so that we don't have to view the entire deployment YAML, just the section beneath containers in the deployment spec.
kubectl get deployment -n alpha alpha-mysql -o yaml | yq e .spec.template.spec.containers -
Find out why the deployment does not have minimum availability. We'll have to find out the name of the deployment's pod first, then describe the pod to see the error.
kubectl get pods -n alpha
kubectl describe pod -n alpha alpha-mysql-xxxxxxxx-xxxxx
We find that the requested PVC isn't present, so create it. First, examine the Persistent Volume to find the values for access modes, capacity (storage), and storage class name
kubectl get pv alpha-pv
Now use vi
to create a PVC manifest
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mysql-alpha-pvc
namespace: alpha
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: slow
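Apply the claim and confirm the deployment recovers (the manifest filename below is just an assumption about how you saved it):
kubectl apply -f mysql-alpha-pvc.yaml
kubectl get pods -n alpha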
Take the backup of ETCD at the location /opt/etcd-backup.db
on the controlplane node
This question is a bit poorly worded. It requires us to make a backup of etcd and store the backup at the given location.
Know that the certificates we need for authentication of etcdctl
are located in /etc/kubernetes/pki/etcd
Get the certificates as shown below
controlplane ~ ➜ k -n kube-system get pod etcd-controlplane -o yaml | grep -i crt
- --cert-file=/etc/kubernetes/pki/etcd/server.crt
- --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
- --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
- --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
Get the command to take backup from docs
ETCDCTL_API='3' etcdctl snapshot save /opt/etcd-backup.db \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key
Check that the backup was taken:
controlplane ~ ➜ ETCDCTL_API=3 etcdctl --write-out=table snapshot status /opt/etcd-backup.db
+----------+----------+------------+------------+
| HASH | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| bc0cb4cf | 5914 | 973 | 2.2 MB |
+----------+----------+------------+------------+
Whilst we could also use the argument --endpoints=127.0.0.1:2379
, it is not necessary here as we are on the controlplane node, same as etcd
itself. The default endpoint is the local host.
Create a pod called secret-1401 in the admin1401 namespace using the busybox image. The container within the pod should be called secret-admin and should sleep for 4800 seconds. The container should mount a read-only secret volume called secret-volume at the path /etc/secret-volume. The secret being mounted has already been created for you and is called dotfile-secret.
- Use an imperative command to get a starter manifest:
kubectl run secret-1401 -n admin1401 --image busybox --dry-run=client -o yaml --command -- sleep 4800 > admin.yaml
- Edit this manifest to add in the details for mounting the secret:
vi admin.yaml
Add in the volume and volume mount sections seen below
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: secret-1401
  name: secret-1401
  namespace: admin1401
spec:
  volumes:
  - name: secret-volume
    secret:
      secretName: dotfile-secret
  containers:
  - command:
    - sleep
    - "4800"
    image: busybox
    name: secret-admin
    volumeMounts:
    - name: secret-volume
      readOnly: true
      mountPath: /etc/secret-volume
- And create the pod:
kubectl create -f admin.yaml
Mock Exam¶
Deploy a pod named nginx-pod using the nginx:alpine image.
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: nginx-pod
name: nginx-pod
spec:
containers:
- image: nginx:alpine
name: nginx-pod
resources: {}
dnsPolicy: ClusterFirst
restartPolicy: Always
status: {}
Deploy a messaging pod using the redis:alpine
image with the labels set to tier=msg
Run the below command, which creates a pod with the label:
kubectl run messaging --image=redis:alpine --labels=tier=msg
Create a namespace named apx-x9984574
Run below command to create a namespace:
kubectl create namespace apx-x9984574
Get the list of nodes in JSON format and store it in a file at /opt/outputs/nodes-z3444kd9.json
Use the below command, which will redirect the output:
kubectl get nodes -o json > /opt/outputs/nodes-z3444kd9.json
Create a service messaging-service to expose the messaging application within the cluster on port 6379.
Execute below command which will expose the pod on port 6379:
kubectl expose pod messaging --port=6379 --name messaging-service
Create a deployment named hr-web-app using the image kodekloud/webapp-color
with 2 replicas.
In v1.19, we can add --replicas
flag with kubectl create deployment
command:
kubectl create deployment hr-web-app --image=kodekloud/webapp-color --replicas=2
Create a static pod named static-busybox on the controlplane node that uses the busybox image and the command sleep 1000
To create a static pod, place its manifest in the static pod directory. In this case, it is /etc/kubernetes/manifests. Generate the manifest straight into that directory:
k run static-busybox --image=busybox --dry-run=client -o yaml --command -- sleep 1000 > /etc/kubernetes/manifests/static-busybox.yaml
This will create the below manifest
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: static-busybox
name: static-busybox
spec:
containers:
- command:
- sleep
- "1000"
image: busybox
name: static-busybox
resources: {}
dnsPolicy: ClusterFirst
restartPolicy: Always
status: {}
Create a POD in the finance namespace named temp-bus with the image redis:alpine.
Run the below command to create a pod in the finance namespace:
kubectl run temp-bus --image=redis:alpine -n finance
A new application orange is deployed. There is something wrong with it. Identify and fix the issue.
Run below command and troubleshoot step by step:
kubectl describe pod orange
Export the running pod using below command and correct the spelling of the command sleeeep
to sleep
kubectl edit pod orange # make changes and save
k replace --force -f temp_file.yaml
Expose the hr-web-app
as service hr-web-app-service
application on port 30082 on the nodes on the cluster.
Apply the below manifest:
apiVersion: v1
kind: Service
metadata:
name: hr-web-app-service
spec:
type: NodePort
selector:
app: hr-web-app
ports:
- port: 8080
targetPort: 8080
nodePort: 30082
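An alternative is to generate most of this imperatively and then set the nodePort by editing the generated YAML, since kubectl expose cannot pin a specific nodePort. A sketch:
kubectl expose deployment hr-web-app --name=hr-web-app-service --type=NodePort --port=8080 --dry-run=client -o yaml > hr-web-app-service.yaml
# edit the file to add nodePort: 30082 under ports, then apply it
kubectl apply -f hr-web-app-service.yaml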
Use JSON PATH query to retrieve the osImages of all the nodes and store it in a file /opt/outputs/nodes_os_x43kj56.txt
Run the below command, which will redirect the output (note [*] so all nodes are covered):
kubectl get nodes -o=jsonpath='{.items[*].status.nodeInfo.osImage}' > /opt/outputs/nodes_os_x43kj56.txt
Create a Persistent Volume with the given specification
Volume name: pv-analytics
Storage: 100Mi
Access mode: ReadWriteMany
Host path: /pv/data-analytics
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv-analytics
spec:
capacity:
storage: 100Mi
volumeMode: Filesystem
accessModes:
- ReadWriteMany
hostPath:
path: /pv/data-analytics
Create a Pod called redis-storage with image: redis:alpine
with a Volume of type emptyDir that lasts for the life of the Pod.
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: redis-storage
name: redis-storage
spec:
containers:
- image: redis:alpine
name: redis-storage
volumeMounts:
- mountPath: /data/redis
name: cache-volume
volumes:
- name: cache-volume
emptyDir:
sizeLimit: 500Mi
dnsPolicy: ClusterFirst
restartPolicy: Always
status: {}
Create a new pod called super-user-pod with image busybox:1.28. Allow the pod to be able to set system_time
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: super-user-pod
name: super-user-pod
spec:
containers:
- image: busybox:1.28
name: super-user-pod
resources: {}
command:
- sleep
- "4800"
securityContext:
capabilities:
add: ["SYS_TIME"]
dnsPolicy: ClusterFirst
restartPolicy: Always
status: {}
Create a new user called john. Grant him access to the cluster. John should have permission to create, list, get, update and delete pods in the development namespace. The private key exists at /root/CKA/john.key
and csr at /root/CKA/john.csr
Important Note: As of kubernetes 1.19, the CertificateSigningRequest object expects a signerName.
Please refer the documentation to see an example. The documentation tab is available at the top right of terminal.
CSR: john-developer Status:Approved
Role Name: developer, namespace: development, Resource: Pods
Access: User 'john' has appropriate permissions
Form the CSR request
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
name: john
spec:
request: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURSBSRVFVRVNULS0tLS0KTUlJQ1ZEQ0NBVHdDQVFBd0R6RU5NQXNHQTFVRUF3d0VhbTlvYmpDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRApnZ0VQQURDQ0FRb0NnZ0VCQUxqR2J3T1NDUFZycHB5QnhDL0ZDSURoTE90TXJyN21nWUlDbFYvcDlkaHZjSXdqCno0ZmpJdTlITDdrZlpTT2kxT21NSDNtTUlMaDRLa3J2bnFaeENXUGFrVEVDN005T1lsNHoyWXlWWDZ5R3p0WEYKYXZqSUcrVUJ5Zmo0V2M5c0l6cEdJS0dqN3JaQmVZamV3STlpUU5yQzc2RFJpcStKZU1oRFhIT2ZtSm9oU0J3YgpsQm9rSEp5aVNITzM1OGx6WEs1UElZaTVqKy9waUFhSHRKbjg3Vzl1K2tpNzJsc3IxN0JoV0FMTzQrOHFDOUgvCjMzZ2VQNUxhMXJTanVjYVk1eE9IL2s2dVdabGVVUUVyeVBqUDg0TW1sUnhrZEVHdTJ6dmY5c2pmZUFWNE1QTkoKYXYxcTMrc0ZNbHB2VndGb2RIbFgzL2ZzK25abHFhYWp2eW5yc1hFQ0F3RUFBYUFBTUEwR0NTcUdTSWIzRFFFQgpDd1VBQTRJQkFRQWhrMVVrTklqSzhDZmx2QTB0bEpOWi83TlgvZUlMQzF6d2h1T0NpQm14R2dUaGZReDdqYWtICnNyMmdUSXlpU0RsdVdVKzVZeW1CeElhL0xHVmRackhpSlBLRzgyVlNmck9DUHgrME1Bbk5PNTZpWWNUZ2RXZ3IKanByaUJYbDdrVkV0UUZjVTVwSGt0aW92Nk5mb0htRzZqT2w5dzVNYzRNMDJGbUN1Yi9sSngrNThIQnI1ekZLQQp4bGRNaXZ5V05CTlY3S3p0a1FkWElsLzR0emllME11ekdxRkxZNWh6R3pDSnVwekd5bmZXc0hmd2JaeWVKTVlrCnlmWldTV0FRSHhEZk5HRWxvNXhja1FTOVBWU29NK0YyNFoveXA3ZEI1Mlc0bE1yYVRsa2VNTy9pU25hRU5tdGwKazhPTDNielhXYS82K0hkdnNremtGK2hpVHFoRW9XTEIKLS0tLS1FTkQgQ0VSVElGSUNBVEUgUkVRVUVTVC0tLS0tCg==
signerName: kubernetes.io/kube-apiserver-client
expirationSeconds: 86400 # one day
usages:
- client auth
# create the CSR
controlplane ~/CKA ➜ k apply -f csr-john.yaml
certificatesigningrequest.certificates.k8s.io/john created
View and approve it
controlplane ~/CKA ➜ k get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
csr-q6l5t 45m kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:rsxu7y <none> Approved,Issued
csr-tnnwr 45m kubernetes.io/kube-apiserver-client-kubelet system:node:controlplane <none> Approved,Issued
john 3m47s kubernetes.io/kube-apiserver-client kubernetes-admin 24h Pending
controlplane ~/CKA ➜ k certificate approve john
certificatesigningrequest.certificates.k8s.io/john approved
Create the role
controlplane ~/CKA ➜ kubectl create role developer --verb=get,list,create,update,delete --resource=pods -n development
role.rbac.authorization.k8s.io/developer created
Create a role binding
controlplane ~/CKA ➜ kubectl create rolebinding developer-rb --role=developer --user=john -n development
rolebinding.rbac.authorization.k8s.io/developer-rb created
Check if it worked
controlplane ~/CKA ✖ k auth can-i get pods -n development --as john
yes
Create an nginx pod called nginx-resolver using image nginx, and expose it internally with a service called nginx-resolver-service. Test that you are able to look up the service and pod names from within the cluster. Use the image busybox:1.28 for dns lookup. Record the results in /root/CKA/nginx.svc and /root/CKA/nginx.pod.
Create the pod, then expose it.
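The creation command is not shown in the transcript; a minimal imperative form matching the task would be:
kubectl run nginx-resolver --image=nginx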
/CKA ✖ kubectl expose pod nginx-resolver --name=nginx-resolver-service --port=8080
service/nginx-resolver-service exposed
Create a test pod with sleep time
controlplane ~/CKA ➜ k run test --image=busybox:1.28 -- sleep 5000
pod/test created
Use it to run the nslookup against the service:
controlplane ~/CKA ➜ k exec test -- nslookup nginx-resolver-service
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: nginx-resolver-service
Address 1: 10.104.187.189 nginx-resolver-service.default.svc.cluster.local
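Record the service lookup to the requested file in the same way (mirroring the pod lookup redirect below):
k exec test -- nslookup nginx-resolver-service > /root/CKA/nginx.svc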
Get the pod records
controlplane ~/CKA ➜ k exec test -- nslookup 10-244-192-4.default.pod.cluster.local
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: 10-244-192-4.default.pod.cluster.local
Address 1: 10.244.192.4 10-244-192-4.nginx-resolver-service.default.svc.cluster.local
controlplane ~/CKA ➜ k exec test -- nslookup 10-244-192-4.default.pod.cluster.local > /root/CKA/nginx.pod
Create a static pod on node01 called nginx-critical with image nginx and make sure that it is recreated/restarted automatically in case of a failure.
controlplane ~ ➜ cat static-pod.yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: nginx-critical
name: nginx-critical
spec:
containers:
- image: nginx
name: nginx-critical
resources: {}
dnsPolicy: ClusterFirst
restartPolicy: Always
status: {}
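Since this is a static pod for node01, the manifest has to land in node01's static pod path (assumed here to be the default /etc/kubernetes/manifests, as set by staticPodPath in the kubelet config):
scp static-pod.yaml node01:/etc/kubernetes/manifests/nginx-critical.yaml
# the kubelet on node01 picks it up automatically; the mirror pod is suffixed with the node name
kubectl get pod nginx-critical-node01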
Create a new service account with the name pvviewer. Grant this Service account access to list all PersistentVolumes in the cluster by creating an appropriate cluster role called pvviewer-role
and ClusterRoleBinding called pvviewer-role-binding
. Next, create a pod called pvviewer with the image: redis and serviceAccount: pvviewer
in the default namespace
controlplane ~ ➜ k create sa pvviewer
controlplane ~ ➜ kubectl create clusterrole pvviewer-role --verb=list --resource=persistentvolumes
clusterrole.rbac.authorization.k8s.io/pvviewer-role created
Create role binding
controlplane ~ ➜ kubectl create clusterrolebinding pvviewer-role-binding --clusterrole=pvviewer-role --serviceaccount=default:pvviewer
clusterrolebinding.rbac.authorization.k8s.io/pvviewer-role-binding created
controlplane ~ ✖ cat pod-view.yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: pvviewer
name: pvviewer
spec:
serviceAccountName: pvviewer
containers:
- image: redis
name: pvviewer
resources: {}
dnsPolicy: ClusterFirst
restartPolicy: Always
status: {}
List the InternalIP
of all nodes of the cluster. Save the result to a file /root/CKA/node_ips
Answer should be in the format: InternalIP of controlplane
kubectl get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}' > /root/CKA/node_ips
Create a pod called multi-pod with two containers.
Container 1: name: alpha, image: nginx
Container 2: name: beta, image: busybox, command: sleep 4800
Environment Variables:
Container 1: name: alpha
Container 2: name: beta
controlplane ~/CKA ➜ cat multi-pod.yaml
apiVersion: v1
kind: Pod
metadata:
name: multi-pod
spec:
containers:
- name: alpha
image: nginx
env:
- name: name
value: "alpha"
- name: beta
image: busybox
env:
- name: name
value: "beta"
command:
- sleep
- "4800"
Create a Pod called non-root-pod, image: redis:alpine, runAsUser: 1000 and fsGroup: 2000
apiVersion: v1
kind: Pod
metadata:
name: non-root-pod
spec:
securityContext:
runAsUser: 1000
fsGroup: 2000
containers:
- name: sec-ctx-demo
image: redis:alpine
We have deployed a new pod called np-test-1 and a service called np-test-service. Incoming connections to this service are not working. Troubleshoot and fix it. Create NetworkPolicy, by the name ingress-to-nptest that allows incoming connections to the service over port 80
controlplane ~/CKA ➜ k apply -f policy-pod.yaml
networkpolicy.networking.k8s.io/ingress-to-nptest created
controlplane ~/CKA ➜ cat policy-pod.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: ingress-to-nptest
spec:
podSelector:
matchLabels:
run: np-test-1
ingress:
- ports:
  - protocol: TCP
    port: 80
policyTypes:
- Ingress
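A quick way to verify that incoming connections now work is a throwaway busybox pod probing the service port (a sketch, not part of the transcript):
kubectl run test-np --image=busybox:1.28 --rm -it --restart=Never -- nc -z -v -w 2 np-test-service 80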
Taint the worker node node01 to be Unschedulable. Once done, create a pod called dev-redis, image redis:alpine, to ensure workloads are not scheduled to this worker node. Finally, create a new pod called prod-redis and image: redis:alpine with toleration to be scheduled on node01.
key: env_type, value: production, operator: Equal and effect: NoSchedule
controlplane ~/CKA ➜ kubectl taint nodes node01 env_type=production:NoSchedule
node/node01 tainted
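The untolerated dev-redis pod can be created imperatively; with the taint in place it will not be scheduled on node01:
kubectl run dev-redis --image=redis:alpine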
Create the tolerated prod-redis pod:
controlplane ~/CKA ➜ cat tolerated-pod.yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: pod-redis
name: prod-redis
spec:
containers:
- image: redis:alpine
name: pod-redis
tolerations:
- key: "env_type"
operator: "Equal"
value: "production"
effect: "NoSchedule"
dnsPolicy: ClusterFirst
restartPolicy: Always
status: {}
Create the pod and add the two labels as shown below:
controlplane ~/CKA ➜ k -n hr label pod hr-pod envionment=production
pod/hr-pod labeled
controlplane ~/CKA ➜ k -n hr get pod --show-labels
NAME READY STATUS RESTARTS AGE LABELS
hr-pod 1/1 Running 0 69s envionment=production,run=hr-pod,tier=frontend