Development Notes

EKS Cluster Upgrade Remarks

Links

These links should be checked before the upgrade.

AWS EKS Upgrade Notes (Extended Support)

AWS EKS Upgrade Notes (Standard Support)

Kubernetes Deprecated API Migration Guide

Commands

**#List pods and containers that carry deprecated seccomp annotations.**
kubectl get pods --all-namespaces -o json | grep -E 'seccomp.security.alpha.kubernetes.io/pod|container.seccomp.security.alpha.kubernetes.io'
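
The grep above prints the matching JSON lines but not the pods they belong to. A minimal sketch that prints namespace/name instead, assuming kubectl renders each pod's annotations map on a single line (recent versions print it as JSON). The function names filter_seccomp_pods and list_seccomp_pods are illustrative, not part of any tool:

```shell
# Read lines of the form "namespace/name<TAB>annotations" and print the
# namespace/name of every pod that mentions an alpha seccomp annotation key.
filter_seccomp_pods() {
  grep -E 'seccomp\.security\.alpha\.kubernetes\.io/' | cut -f 1
}

# Produce one line per pod and pass it through the filter above.
# Not called here, because it needs access to a cluster.
list_seccomp_pods() {
  kubectl get pods --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{.metadata.annotations}{"\n"}{end}' \
    | filter_seccomp_pods
}
```

Run list_seccomp_pods against the cluster; an empty result means no pod matched.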

**#Check the VPC CNI plugin version; it matters when upgrading to 1.26.**
kubectl get daemonset aws-node -n kube-system -o jsonpath='{.spec.template.spec.containers[0].image}'

**#While migrating from Pod Security Policy (PSP) to Pod Security Admission, use this
#mapping to label namespaces.**
# MODE must be one of `enforce`, `audit`, or `warn`.
# LEVEL must be one of `privileged`, `baseline`, or `restricted`.
pod-security.kubernetes.io/<MODE>: <LEVEL>

# MODE must be one of `enforce`, `audit`, or `warn`.
# VERSION must be a valid Kubernetes minor version, or `latest`.
pod-security.kubernetes.io/<MODE>-version: <VERSION>

#Example labeling command
# Label 'default' namespace with audit mode set to 'baseline'
kubectl label namespace default \
  pod-security.kubernetes.io/audit=baseline \
  pod-security.kubernetes.io/audit-version=latest \
  --overwrite
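
To avoid typos in MODE and LEVEL, the label pair can be generated and validated before it is passed to kubectl. This is only a sketch; psa_labels is a hypothetical helper name, and the version defaults to latest as in the example above:

```shell
# Print the two Pod Security Admission labels for a MODE/LEVEL pair.
# Usage: psa_labels MODE LEVEL [VERSION]
psa_labels() {
  mode="$1"
  level="$2"
  version="${3:-latest}"
  case "$mode" in
    enforce|audit|warn) ;;
    *) echo "MODE must be enforce, audit, or warn: $mode" >&2; return 1 ;;
  esac
  case "$level" in
    privileged|baseline|restricted) ;;
    *) echo "LEVEL must be privileged, baseline, or restricted: $level" >&2; return 1 ;;
  esac
  printf 'pod-security.kubernetes.io/%s=%s\n' "$mode" "$level"
  printf 'pod-security.kubernetes.io/%s-version=%s\n' "$mode" "$version"
}

# Example (needs cluster access, so it is left commented out):
# kubectl label namespace default $(psa_labels audit baseline) --overwrite
```

The command substitution is intentionally unquoted so that each label becomes a separate argument.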
  
#Check VPC CNI Version
kubectl describe daemonset aws-node --namespace kube-system | grep amazon-k8s-cni: | cut -d : -f 3

#Check CoreDNS Version
kubectl describe deployment coredns -n kube-system | grep Image | cut -d ":" -f 3

#Check Cluster Autoscaler Version
kubectl get deployment clusterautoscaler-aws-cluster-autoscaler -n kube-system -o=jsonpath='{.spec.template.spec.containers[0].image}'
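
The cut -d : -f 3 calls above depend on how many colons appear before the tag. A small sketch that extracts the tag from an image reference with shell parameter expansion instead; image_tag is an illustrative name, and digest-pinned images are deliberately rejected rather than guessed at:

```shell
# Print the tag of an image reference such as registry/repo/name:tag.
image_tag() {
  ref="${1##*/}"   # drop the registry and repository path, so a registry port is not mistaken for a tag
  case "$ref" in
    *@*) echo "digest-pinned image, no tag reported: $1" >&2; return 1 ;;
    *:*) printf '%s\n' "${ref##*:}" ;;
    *)   echo "no tag in image reference: $1" >&2; return 1 ;;
  esac
}

# Example (needs cluster access, so it is left commented out):
# image_tag "$(kubectl get deployment coredns -n kube-system -o jsonpath='{.spec.template.spec.containers[0].image}')"
```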

Tools

Kubent

  • Quickly checks the cluster for deprecated APIs. On clusters with many Helm releases it can fail with an error; in that case, check which API versions the cluster serves with kubectl api-versions.

Pluto

  • Pluto is a utility to help users find deprecated Kubernetes apiVersions in their code repositories and their Helm releases. To scan a running cluster, run pluto detect-all-in-cluster.

Migrate Workers to New AMIs

If your workers are self-managed, update the launch template first and then the ASG. Because the nodes are self-managed, you then have to cordon and drain them manually. (Not recommended)
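
The manual cordon and drain step for a self-managed node could look like the sketch below. The run and drain_node helpers, the DRY_RUN switch, and the 300s timeout are assumptions for illustration; adjust the drain flags to your workloads:

```shell
# Execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Mark a node unschedulable, then evict its pods.
# DaemonSet pods are skipped and emptyDir data is discarded.
drain_node() {
  node="$1"
  run kubectl cordon "$node" &&
    run kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data --timeout=300s
}
```

Setting DRY_RUN=1 before calling drain_node prints the two kubectl commands without touching the cluster, which is a cheap way to review them first.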

If your workers are managed through managed node groups, you can update them with the update button. Nodes are cordoned and drained automatically.

  • Make sure every worker group (each managed node group and each separate set of self-managed workers) matches the control plane version. Otherwise, upgrading the cluster version fails with an error.
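
One way to spot worker groups that lag behind before starting the upgrade is to compare each node's kubelet minor version with the control plane minor. A rough sketch; kubelet_minor and nodes_not_on_minor are hypothetical helper names, and the input format is one "name<TAB>kubeletVersion" line per node:

```shell
# "v1.27.9-eks-abc123" -> "27"
kubelet_minor() {
  printf '%s\n' "$1" | sed -E 's/^v?[0-9]+\.([0-9]+).*/\1/'
}

# Read "name<TAB>kubeletVersion" lines and print the nodes whose
# kubelet minor version differs from the wanted minor.
nodes_not_on_minor() {
  want="$1"
  while IFS="$(printf '\t')" read -r name version; do
    [ -n "$name" ] || continue
    if [ "$(kubelet_minor "$version")" != "$want" ]; then
      printf '%s\t%s\n' "$name" "$version"
    fi
  done
}

# Example for a 1.27 control plane (needs cluster access, so it is left commented out):
# kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kubeletVersion}{"\n"}{end}' \
#   | nodes_not_on_minor 27
```

An empty result means every node reports the expected minor version.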

Self Managed Addons

Common add-ons

Amazon VPC CNI

https://docs.aws.amazon.com/eks/latest/userguide/managing-vpc-cni.html

Update to desired version → Supported up to 1.31

https://github.com/aws/amazon-vpc-cni-k8s/releases/tag/v1.19.0

kube-proxy

https://docs.aws.amazon.com/eks/latest/userguide/managing-kube-proxy.html

Manually check the image version used by your Deployment and DaemonSet. For 1.25 and later you must use the minimal EKS build image.

CoreDNS

https://docs.aws.amazon.com/eks/latest/userguide/coredns-add-on-self-managed-update.html

If you’re updating to CoreDNS 1.8.3 or later, then you need to add the endpointslices permission to the system:coredns Kubernetes clusterrole.
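
As a sketch of how the endpointslices permission might be added without opening an editor, the JSON patch below appends a list/watch rule for discovery.k8s.io endpointslices to the ClusterRole. The variable and function names are illustrative. The patch always appends, so check first whether the rule already exists:

```shell
# JSON patch that appends a rule granting list/watch on EndpointSlices.
COREDNS_ENDPOINTSLICE_PATCH='[{"op":"add","path":"/rules/-","value":{"apiGroups":["discovery.k8s.io"],"resources":["endpointslices"],"verbs":["list","watch"]}}]'

# Apply the patch only if the ClusterRole does not mention endpointslices yet.
# Not called here, because it needs access to a cluster.
patch_coredns_clusterrole() {
  if kubectl get clusterrole system:coredns -o yaml | grep -q endpointslices; then
    echo "system:coredns already references endpointslices; nothing to do"
  else
    kubectl patch clusterrole system:coredns --type=json -p "$COREDNS_ENDPOINTSLICE_PATCH"
  fi
}
```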

Update CoreDNS with the command below. Replace 602401143452, the region, and the version number to match your cluster configuration.

#Get configs 
**kubectl describe deployment coredns -n kube-system | grep Image**

#Update Image
**kubectl set image deployment.apps/coredns -n kube-system  coredns=602401143452.dkr.ecr.eu-central-1.amazonaws.com/eks/coredns:v1.9.3-eksbuild.21**
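
Because the account ID, region, and version all vary per cluster, it may help to assemble the image reference from its parts. A minimal sketch, assuming the standard amazonaws.com registry domain; coredns_image is a hypothetical helper:

```shell
# Build the CoreDNS image reference from registry account, region, and tag.
# Usage: coredns_image ACCOUNT_ID REGION TAG
coredns_image() {
  account="$1"
  region="$2"
  tag="$3"
  printf '%s.dkr.ecr.%s.amazonaws.com/eks/coredns:%s\n' "$account" "$region" "$tag"
}

# Example (needs cluster access, so it is left commented out):
# kubectl set image deployment.apps/coredns -n kube-system \
#   coredns="$(coredns_image 602401143452 eu-central-1 v1.9.3-eksbuild.21)"
```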

Cluster Autoscaler

https://github.com/kubernetes/autoscaler/releases

To upgrade the version of Kubernetes Cluster Autoscaler, change the version of the image in the deployment.

#Check version
**kubectl get deployment clusterautoscaler-aws-cluster-autoscaler -n kube-system -o=jsonpath='{.spec.template.spec.containers[0].image}'**

Notes

  • Before upgrading to 1.26, migrate HPAs to the new API version (autoscaling/v2). Skipping the migration may not produce an error, but it can disrupt the cluster's ability to scale.
  • In the 1.27 release, the kubelet ignores the --container-runtime argument. Make sure you are running the containerd runtime, and remove the --container-runtime argument from all of your node creation workflows and build scripts.
  • The alpha seccomp annotations were deprecated in 1.19. With their removal in 1.27, seccomp fields are no longer auto-populated for Pods with seccomp annotations. Instead, use the securityContext.seccompProfile field for Pods or containers to configure seccomp profiles.
  • Kubernetes version 1.25 contains changes that alter the behavior of an existing feature known as API Priority and Fairness (APF). This can affect clusters that customize API response priority under high load.
  • The AWS Load Balancer Controller v2.4.6 and earlier used the v1beta1 endpoint to communicate with EndpointSlices. If you’re using the EndpointSlices configuration for the AWS Load Balancer Controller, you must upgrade to AWS Load Balancer Controller v2.4.7 before upgrading your Amazon EKS cluster to 1.25.
  • In Kubernetes v1.28, the supported skew between core node components (kubelet and kube-proxy) and control plane components (kube-apiserver, kube-scheduler, kube-controller-manager, cloud-controller-manager) has been expanded from n-2 to n-3. This means that nodes can now be three minor versions behind the control plane, providing greater flexibility in upgrade processes.
  • The PersistentVolume (PV) controller now automatically assigns a default StorageClass to any unbound PersistentVolumeClaim (PVC) where the storageClassName is not specified.
  • The LegacyServiceAccountTokenCleanUp feature in Kubernetes 1.29 is designed to enhance security by reducing the potential attack surface. It targets legacy auto-generated secret-based service account tokens that have not been used for an extended period. Tokens are labeled as invalid if they haven’t been used for 1 year (default). After being marked invalid, tokens are automatically removed if not attempted to be used for an additional 1 year (default).
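
For the --container-runtime note above, leftover uses of the argument in node bootstrap scripts can be found with a recursive grep. This is a sketch only; find_container_runtime_flag is an illustrative name, and the pattern is written so that the unrelated --container-runtime-endpoint flag is not reported:

```shell
# Print file:line:text for every use of --container-runtime under a directory.
# The character class after the flag name excludes --container-runtime-endpoint.
find_container_runtime_flag() {
  dir="${1:-.}"
  grep -rnE -e '--container-runtime([^-a-zA-Z]|$)' "$dir"
}
```

Calling find_container_runtime_flag with the path to your launch template user data or image build scripts lists the lines that still need cleaning up. The function returns a non-zero status when nothing is found.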