Migrating VPC CNI from Self-Managed to EKS Managed Add-on

If you’ve been running Amazon EKS clusters for a while, chances are your VPC CNI plugin is self-managed. This was the default for clusters created before AWS introduced managed addons, and many teams have kept it that way ever since.


What is VPC CNI and Why It Matters

Amazon VPC CNI (Container Network Interface) is the networking backbone of every EKS cluster. It’s responsible for one of the most fundamental operations in Kubernetes — assigning IP addresses to pods. Without it, no pod can communicate with anything inside or outside your cluster.

It runs as a DaemonSet called aws-node on every single node in your cluster, managing Elastic Network Interfaces (ENIs) and IP address pools on each node. When a pod is created, VPC CNI is the component that reaches into that pool and hands it a VPC-native IP address.
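The size of that IP pool is bounded by the instance type's ENI and per-ENI address limits. As a rough illustration (the m5.large figures below are an example; exact limits vary per instance type), the classic max-pods arithmetic can be checked in plain shell:

```shell
# EKS max-pods formula: enis * (ips_per_eni - 1) + 2
# One IP per ENI is reserved for the ENI itself; the +2 conventionally
# accounts for host-networked pods such as aws-node and kube-proxy.
# Illustrative limits for an m5.large: 3 ENIs, 10 IPv4 addresses each
enis=3
ips_per_eni=10
max_pods=$(( enis * (ips_per_eni - 1) + 2 ))
echo "$max_pods"   # → 29
```

This is why switching on features like prefix delegation (covered below) matters: it changes how many addresses VPC CNI can hand out per node.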


Self-Managed vs Managed Addons

When VPC CNI is self-managed, you own everything — the manifest, the version, the upgrades. Every time a new version is released, you need to track the changelog, assess compatibility with your Kubernetes version, test the upgrade, and apply it manually. Miss a version and you’re running outdated code with unpatched bugs and security issues.

AWS managed addons shift that effort. AWS handles version compatibility, security patches, and validates each release against EKS. You still control when to upgrade, but the heavy lifting of tracking and validating releases is no longer your problem. You also get native integration with the EKS API, AWS Console, and CloudFormation.


Why We Decided to Migrate

The reason was straightforward — management flexibility.

With self-managed VPC CNI, every upgrade is a manual process: you track releases, read changelogs, assess compatibility with your Kubernetes version, update the manifest, and apply it yourself. Across multiple clusters, this becomes significant toil over time, and the manual steps add risk.

AWS managed addons give you a simpler operational model. Upgrades are triggered via the EKS API, AWS Console, or CLI with a single command. Version compatibility with your Kubernetes version is handled by AWS. You can also manage addon configuration explicitly through --configuration-values rather than manually editing DaemonSet manifests.


Pre-Migration Checklist

Before running a single command, there are several things you need to verify and prepare. Skipping these steps is how migrations go wrong.

1. Backup Your Current CNI Configuration

Export the full aws-node DaemonSet manifest before touching anything. If anything goes wrong during or after migration, this file is your source of truth for restoring the original state.

kubectl get daemonset aws-node -n kube-system -o yaml > aws-node-backup-$(date +%Y%m%d).yaml

2. Identify Your Custom Environment Variables

Not all VPC CNI deployments are identical. Many teams have customized environment variables that deviate from AWS defaults — things like MTU settings, warm pool targets, or log levels.

Check your current custom values:

kubectl get daemonset aws-node -n kube-system \
  -o jsonpath='{.spec.template.spec.containers[0].env}' | jq .

Pay close attention to variables like:

  • AWS_VPC_ENI_MTU — if you’re using jumbo frames (9001), this is critical to preserve
  • WARM_ENI_TARGET / WARM_IP_TARGET / MINIMUM_IP_TARGET — affects IP pool behavior
  • AWS_VPC_K8S_CNI_LOGLEVEL — logging configuration
  • ENABLE_PREFIX_DELEGATION — if already enabled, must be carried over

3. Verify Your OIDC Provider

AWS managed addons strongly prefer using IRSA (IAM Roles for Service Accounts) for permissions rather than relying on the node instance profile. Before setting up IRSA, confirm that an OIDC provider is already registered for your cluster.

The OIDC ID from the first command should appear in the second command’s output. If it doesn’t, you’ll need to create the OIDC provider before proceeding.

# Get your cluster's OIDC issuer
aws eks describe-cluster --name <cluster-name> \
  --query "cluster.identity.oidc.issuer" \
  --output text

# Verify it exists in IAM
aws iam list-open-id-connect-providers
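If the provider is missing, eksctl can register it in one step (shown as a sketch; flag names assume a current eksctl version):

```shell
# Register the cluster's OIDC issuer as an IAM OIDC provider
eksctl utils associate-iam-oidc-provider \
  --cluster <cluster-name> \
  --region <region> \
  --approve
```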

4. Set Up IRSA for VPC CNI

By default, aws-node inherits permissions from the EC2 node IAM role. This works but it means the CNI plugin has access to everything the node role allows — more permissions than it actually needs. IRSA scopes the permissions down to exactly what VPC CNI requires.

Check if IRSA is already configured:

kubectl describe serviceaccount aws-node -n kube-system

If you see eks.amazonaws.com/role-arn in the annotations, you’re already set. If Annotations: <none>, create the IRSA role:

eksctl create iamserviceaccount \
  --name aws-node \
  --namespace kube-system \
  --cluster <cluster-name> \
  --role-name AmazonEKSVPCCNIRole \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy \
  --override-existing-serviceaccounts \
  --approve \
  --region <region>

The --override-existing-serviceaccounts flag is safe here — it only adds the IRSA annotation to the existing service account without recreating it or affecting running pods.

5. Understand OVERWRITE vs PRESERVE

When creating a managed addon on a cluster that already has a self-managed version running, you must use --resolve-conflicts OVERWRITE. This is not optional — using PRESERVE during create-addon behaves the same as NONE, meaning the migration can fail if any configuration conflicts are detected.

However, OVERWRITE alone would reset all your custom environment variables back to AWS defaults. This is where --configuration-values comes in. By combining OVERWRITE with your explicit configuration values in the same command, the migration succeeds and your custom settings are applied simultaneously.

An important distinction to keep in mind:

  • OVERWRITE is for create-addon (migration)
  • PRESERVE is for update-addon (after migration, to protect your config during version upgrades)

Once you’re on the managed addon, always use PRESERVE for future updates.
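A later version upgrade on the managed addon would then look something like this (a sketch; <new-version> is whatever describe-addon-versions reports as compatible with your cluster):

```shell
# PRESERVE keeps your --configuration-values intact during the version bump
aws eks update-addon \
  --cluster-name <cluster-name> \
  --addon-name vpc-cni \
  --addon-version <new-version> \
  --resolve-conflicts PRESERVE \
  --region <region>
```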


Step-by-Step Migration Walkthrough

With the checklist complete, we’re ready to migrate. This walkthrough covers the exact steps to move from self-managed to EKS managed VPC CNI with zero downtime.

Phase 1: Pre-Flight Checks

First, confirm the addon is not already managed and identify the correct version to use:

# Confirm self-managed — should return ResourceNotFoundException
aws eks describe-addon \
  --cluster-name <cluster-name> \
  --addon-name vpc-cni

# Check your current CNI version
kubectl describe daemonset aws-node --namespace kube-system \
  | grep amazon-k8s-cni: | cut -d : -f 3

# List available managed addon versions for your Kubernetes version
aws eks describe-addon-versions \
  --kubernetes-version <k8s-version> \
  --addon-name vpc-cni \
  --query 'addons[].addonVersions[].addonVersion' \
  --output table

Match the managed addon version to your currently running CNI version. For example, if you’re running v1.19.6, pick v1.19.6-eksbuild.7 — the latest eksbuild for that version. This ensures no CNI version change happens during the migration itself, keeping the operation purely a management model change.
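The "latest eksbuild for the running version" selection can be scripted. A minimal sketch in plain shell, assuming version strings of the shape shown above (the sample list is illustrative, not real describe-addon-versions output):

```shell
# Illustrative list of managed addon versions
versions="v1.19.6-eksbuild.1
v1.19.6-eksbuild.7
v1.20.0-eksbuild.1"
running="v1.19.6"

# Keep only builds of the running version, sort numerically by the
# eksbuild suffix (4th dot-separated field), and take the highest one
target=$(printf '%s\n' "$versions" | grep "^${running}-" | sort -t. -k4 -n | tail -n1)
echo "$target"   # → v1.19.6-eksbuild.7
```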

Phase 2: IRSA Setup

If you completed the IRSA setup in the pre-migration checklist, verify the annotation is still in place:

kubectl describe serviceaccount aws-node -n kube-system

Expected output:

Annotations: eks.amazonaws.com/role-arn: arn:aws:iam::<account-id>:role/AmazonEKSVPCCNIRole

Phase 3: The Migration Command

This is the single command that performs the migration. Every flag matters:

aws eks create-addon \
  --cluster-name <cluster-name> \
  --addon-name vpc-cni \
  --addon-version <version> \
  --resolve-conflicts OVERWRITE \
  --service-account-role-arn arn:aws:iam::<account-id>:role/AmazonEKSVPCCNIRole \
  --configuration-values '{"env":{"AWS_VPC_ENI_MTU":"9001","WARM_ENI_TARGET":"1","WARM_PREFIX_TARGET":"1","AWS_VPC_K8S_CNI_LOGLEVEL":"DEBUG","AWS_VPC_K8S_PLUGIN_LOG_LEVEL":"DEBUG"}}' \
  --region <region>

Here’s what each flag does:

  • --addon-version — pins the managed addon to your currently running CNI version, avoiding any unintended version jump during migration
  • --resolve-conflicts OVERWRITE — required for migrating from self-managed. Allows EKS to take ownership of the existing DaemonSet
  • --service-account-role-arn — attaches the IRSA role so CNI uses scoped permissions instead of the node instance profile
  • --configuration-values — explicitly sets your custom env vars so they survive the OVERWRITE and are locked into the managed addon config from day one

When the command succeeds you’ll immediately see the addon in CREATING status with your configuration values confirmed in the response:

{
    "addon": {
        "addonName": "vpc-cni",
        "status": "CREATING",
        "addonVersion": "v1.19.6-eksbuild.7",
        "configurationValues": "{\"env\":{\"AWS_VPC_ENI_MTU\":\"9001\",\"WARM_ENI_TARGET\":\"1\"...}}"
    }
}

Phase 4: Monitor the Migration

Open two terminals and watch both the addon status and pods simultaneously:

Terminal 1 — Addon status

watch -n 5 'aws eks describe-addon \
  --cluster-name <cluster-name> \
  --addon-name vpc-cni \
  --query "addon.{Status:status,Version:addonVersion,Health:health}"'

Terminal 2 — Pod watch

kubectl get pods -n kube-system -l k8s-app=aws-node -w

The addon status should transition from CREATING to ACTIVE within a minute or two.
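If you prefer scripting over watching, the AWS CLI also ships a waiter for exactly this transition (assuming a reasonably recent AWS CLI v2):

```shell
# Blocks until the addon reaches ACTIVE, or exits non-zero on timeout
aws eks wait addon-active \
  --cluster-name <cluster-name> \
  --addon-name vpc-cni
```

This is handy in CI pipelines where the migration is part of a larger automated run.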

Phase 5: Post-Migration Verification

Once the addon reaches ACTIVE, run these checks to confirm everything is intact.

Verify the addon is active and healthy:

aws eks describe-addon \
  --cluster-name <cluster-name> \
  --addon-name vpc-cni \
  --query "addon.{Status:status,Version:addonVersion,Health:health}"

Confirm your custom env vars survived:

kubectl get daemonset aws-node -n kube-system \
  -o jsonpath='{.spec.template.spec.containers[0].env}' | jq .

Check that AWS_VPC_ENI_MTU, WARM_ENI_TARGET, WARM_PREFIX_TARGET and your log levels are all present with the correct values.

Test pod networking:

kubectl run test-pod --image=busybox --rm -it --restart=Never \
  -- nslookup kubernetes.default.svc.cluster.local

A successful DNS resolution confirms pod networking is fully functional after the migration.
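If you preserved a non-default MTU, it is also worth spot-checking that the value actually landed inside a pod. A sketch, assuming eth0 is the pod's primary interface (the usual case under VPC CNI):

```shell
# The mtu value in the output should match AWS_VPC_ENI_MTU (e.g. 9001)
kubectl run mtu-check --image=busybox --rm -it --restart=Never \
  -- ip link show eth0
```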


What Actually Happens Under the Hood

When you run create-addon on a cluster with an existing self-managed VPC CNI, EKS adopts the existing aws-node DaemonSet and applies your --configuration-values to it. This triggers a rolling update of the aws-node DaemonSet across your nodes.

  • Nodes whose aws-node pod is being restarted temporarily rely on the CNI binary already present on the host for existing pods
  • New pod scheduling on those nodes is briefly paused until the aws-node pod comes back up
  • Existing running pods on those nodes are not affected

This is what makes the migration safe to run on a live production cluster without a maintenance window.
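The pace of that rolling update is governed by the DaemonSet's own update strategy, which you can inspect directly:

```shell
# Shows the rolling update parameters EKS applies to aws-node,
# including maxUnavailable
kubectl get daemonset aws-node -n kube-system \
  -o jsonpath='{.spec.updateStrategy}'
```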


Conclusion

Migrating VPC CNI from self-managed to EKS managed addon is an operation that sounds risky on paper — touching the networking layer of a live production cluster with no maintenance window. In practice, when done correctly, it is a clean and safe operation.

Operationally, we moved from a manually managed DaemonSet to an AWS managed addon. Version upgrades that previously required tracking GitHub releases, reading changelogs, and applying manifests manually are now a single API call. Configuration is explicitly declared through --configuration-values rather than scattered across manually edited manifests.

Security-wise, we replaced broad node instance profile permissions with a scoped IRSA role — VPC CNI now has exactly the permissions it needs and nothing more.

Technically, the migration completed with zero workload disruption. Existing pods continued running throughout. The rolling update of aws-node pods was controlled by the maxUnavailable: 10% update strategy, and new pod scheduling resumed on each node as soon as its aws-node pod came back up.

The key lessons from this migration:

  • Always match the managed addon version to your currently running CNI version — the migration should be a management model change, not a version change
  • OVERWRITE is required for create-addon, PRESERVE is for update-addon — these are not interchangeable
  • Combine OVERWRITE with --configuration-values in the same command to protect your custom settings
  • Community articles are helpful starting points but always cross-reference with official AWS documentation — the OVERWRITE vs PRESERVE behavior is frequently misrepresented

Official Documentation