How to Continue Dilatation When Container Network Segment Space Is Exhausted (VPC Network Mode)
Note: The following only applies to clusters using "VPC Route" mode
Overview
The maximum number of nodes in a cluster is determined by the size of the container network segment and the maximum pod count per node. For example:
- With a container network segment of 172.16.0.0/16 and a maximum of 256 pods per node, the cluster can have at most 256 nodes.
- With a container network segment of 192.168.0.0/22 and a maximum of 128 pods per node, the cluster can have at most 8 nodes.
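The arithmetic behind these examples can be checked directly; the following is a sketch using plain shell arithmetic only (it does not query the cluster). The per-node mask is 32 minus log2(max pods per node), and the maximum node count is 2 raised to (per-node mask minus cluster mask).
# 172.16.0.0/16 with 256 pods per node -> per-node /24 -> 2^(24-16) nodes
echo $(( 2 ** (24 - 16) ))   # 256
# 192.168.0.0/22 with 128 pods per node -> per-node /25 -> 2^(25-22) nodes
echo $(( 2 ** (25 - 22) ))   # 8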
In some situations, if the container network segment chosen during cluster creation is too small, or the maximum pod count per node is set too high, expanding the cluster may exceed the maximum allowable node count. When the kube-controller-manager cannot allocate a pod network segment for a new node, that node may remain in the NotReady state.
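To confirm that a node is stuck for this reason rather than some other problem, the checks below can help (a sketch using standard kubectl; the exact event wording may vary across Kubernetes versions):
# Pod CIDR assigned to the node; empty output means no segment was allocated
kubectl get node <nodeName> -o jsonpath='{.spec.podCIDR}'
# Node description and events; look for CIDR-related entries such as CIDRNotAvailable
kubectl describe node <nodeName> | grep -i cidr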
Note: Before March 11, 2021, the controller manager was deployed in binary mode, adhering to the older solution. For clusters created after March 11, 2021, the controller manager is deployed as a static pod, following the updated solution.
Solution for clusters whose master is deployed in binary mode
For clusters created before March 11, 2021, the controller manager is deployed as a binary managed by systemd.
Currently, only the V1 architecture of the container network is supported in this mode, though it is possible to manually upgrade to the V2 architecture. To check the current container network version, verify whether the cluster contains the cce-cni-node-agent object; if it is present, the version in use is V1.
$ kubectl -n kube-system get cm cce-cni-node-agent
NAME                 DATA   AGE
cce-cni-node-agent   1      125d
Step 1
Modify the kube-controller-manager configuration on the cluster's master node; the field to adjust is --node-cidr-mask-size. Because the goal is to let the cluster accommodate more nodes by reducing the maximum pod count per node, the value of --node-cidr-mask-size must be increased.
For clusters with multiple master replicas, each master node’s configuration must be updated individually.
Note: Do not change --node-cidr-mask-size to a smaller value than the current one, as this will cause network failures due to network segment conflicts.
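Before editing, it is worth confirming the current value. A minimal check, assuming the systemd service file path used in the practical case below:
grep -- --node-cidr-mask-size /etc/systemd/system/kube-controller.service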
Step 2
There are two methods to remove a node from the cluster and rejoin it:
- From the CCE product console interface, select "Remove Node" or "Delete Node," and then "Move into Node" or "Add a Node."
- Execute kubectl delete node <nodeName> to remove the node from the Kubernetes cluster. Execute kubectl get pods --all-namespaces=true -o wide | grep <nodeName> to confirm that no pods remain on the node. Then restart the kubelet on the node to rejoin it to the cluster.
Note: Regardless of the method used to remove a node from the cluster, all pods on the removed node will be rescheduled. For nodes hosting online services, proceed with caution.
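For nodes hosting online services, one optional precaution (standard kubectl, not a CCE-specific step) is to drain the node before deleting it so that workloads are evicted gracefully; a minimal sketch:
# Evict workloads gracefully; DaemonSet pods are left in place
# (on older kubectl versions, --delete-emptydir-data is spelled --delete-local-data)
kubectl drain <nodeName> --ignore-daemonsets --delete-emptydir-data
kubectl delete node <nodeName>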
Practical case
Problem scenario
The current cluster has a container network segment of 172.26.0.0/22, with the kube-controller-manager configured as --node-cidr-mask-size=24. This means the cluster can support up to 4 nodes, each with a maximum pod capacity of 256.
The cluster already has 4 nodes. Further expansion will render additional nodes unusable.
[root@instance-rhkiutp6-3 ~]# kubectl get node
NAME       STATUS   ROLES    AGE    VERSION
10.0.5.3   Ready    <none>   119m   v1.13.10
10.0.5.4   Ready    <none>   117m   v1.13.10
10.0.5.5   Ready    <none>   20m    v1.13.10
10.0.5.6   Ready    <none>   118m   v1.13.10
[root@instance-rhkiutp6-3 ~]# kubectl describe node | grep -i podcidr
PodCIDR:    172.26.2.0/24
PodCIDR:    172.26.1.0/24
PodCIDR:    172.26.3.0/24
PodCIDR:    172.26.0.0/24
Modification steps
Step 1
On the master, execute vim /etc/systemd/system/kube-controller.service to view the kube-controller-manager configuration:
[Unit]
Description=Kubernetes Controller Manager
After=network.target
After=kube-apiserver.service

[Service]
ExecStart=/opt/kube/bin/kube-controller-manager \
  --allocate-node-cidrs=true \
  --cloud-config=/etc/kubernetes/cloud.config \
  --cluster-cidr=172.26.0.0/22 \
  --node-cidr-mask-size=24 \          # Modify here
  .......
  --kubeconfig=/etc/kubernetes/controller-manager.conf \
  --leader-elect=true \
  --logtostderr=true \
  --master=https://100.64.230.195:6443 \
  --v=6
Restart=always
Type=simple
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
Change the --node-cidr-mask-size value from 24 to 26. After this adjustment, the cluster will be able to accommodate up to 16 nodes, though the maximum pod count per node will decrease to 64.
After modifying the configuration on each master node, execute the following commands to restart kube-controller-manager:
systemctl daemon-reload
systemctl restart kube-controller.service
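After the restart, the new flag can be confirmed to be in effect; a quick check (a sketch, output format depends on the environment):
systemctl status kube-controller.service --no-pager
ps -ef | grep kube-controller-manager | grep -o -- '--node-cidr-mask-size=[0-9]*'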
Step 2
Execute kubectl delete node 10.0.5.4, and the 10.0.5.4 node will no longer appear in the cluster status:
[root@instance-rhkiutp6-3 ~]# kubectl get node
NAME       STATUS   ROLES    AGE    VERSION
10.0.5.3   Ready    <none>   132m   v1.13.10
10.0.5.5   Ready    <none>   33m    v1.13.10
10.0.5.6   Ready    <none>   132m   v1.13.10
[root@instance-rhkiutp6-3 ~]# kubectl describe node | grep -i podcidr
PodCIDR:    172.26.2.0/24
PodCIDR:    172.26.3.0/24
PodCIDR:    172.26.0.0/24
Execute kubectl get pods --all-namespaces=true -o wide | grep <nodeName> to confirm that no pods remain on 10.0.5.4.
On node 10.0.5.4, execute systemctl restart kubelet.service to restart kubelet. The cluster status then shows that node 10.0.5.4 has rejoined, and its container network segment has changed to 172.26.1.0/26:
[root@instance-rhkiutp6-3 ~]# kubectl get node
NAME       STATUS   ROLES    AGE     VERSION
10.0.5.3   Ready    <none>   138m    v1.13.10
10.0.5.4   Ready    <none>   3m55s   v1.13.10
10.0.5.5   Ready    <none>   40m     v1.13.10
10.0.5.6   Ready    <none>   138m    v1.13.10
[root@instance-rhkiutp6-3 ~]# kubectl describe node | grep -i podcidr
PodCIDR:    172.26.2.0/24
PodCIDR:    172.26.1.0/26
PodCIDR:    172.26.3.0/24
PodCIDR:    172.26.0.0/24
Each time an existing node is removed and moved back in this way, its former /24 segment is freed, creating room for 3 additional nodes. In this example, 3 more nodes can now be added to the cluster, and their PodCIDRs will be allocated as 172.26.1.64/26, 172.26.1.128/26, and 172.26.1.192/26.
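To keep track of which segments have been handed out as nodes are cycled, the allocated PodCIDR of every node can be listed in one command (standard kubectl, shown here as a sketch):
kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR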
Users should continue following the outlined steps to add or remove nodes, creating additional space for expansion as needed.
Note: While nodes with differing PodCIDR masks can theoretically coexist in the same cluster, it is recommended to add or remove all nodes at once to ensure they share the same PodCIDR mask.
Solution for clusters whose master uses static pod deployment
For clusters created after March 11, 2021, the controller manager is deployed as a static pod, and its configuration can be updated by modifying the corresponding manifest file.
The container network architecture is classified into V1 and V2 versions. Confirm the version in use by checking whether the cluster contains the cce-cni-node-agent object; if it is present, the version in use is V1.
$ kubectl -n kube-system get cm cce-cni-node-agent
NAME                 DATA   AGE
cce-cni-node-agent   1      125d
Step 1: Modify configuration
Modify the configuration of kube-controller-manager on the master nodes of the cluster; the field to be modified is --node-cidr-mask-size.
Since the goal of this modification is to allow the cluster to accommodate more nodes by reducing the maximum pod count per node, the --node-cidr-mask-size value must be increased.
For clusters with multiple master replicas, each master node’s configuration must be updated individually.
Note: Do not change node-cidr-mask-size to a smaller value than the current one, as this will cause network failures due to network segment conflicts.
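Before editing, the current value can be confirmed with a quick check, assuming the static pod manifest path and the pod label shown in the practical case below:
grep -- --node-cidr-mask-size /etc/kubernetes/manifests/kube-controller-manager.yaml
kubectl -n kube-system get pod -l component=kube-controller-manager -o yaml | grep node-cidr-mask-size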
Step 2: Modify the network component
The container network supports both V1 and V2 architectures, which require different modification procedures. Confirm the version in use by checking if the cluster contains the cce-cni-node-agent object. If it is present, the version in use is V1.
$ kubectl -n kube-system get cm cce-cni-node-agent
NAME                 DATA   AGE
cce-cni-node-agent   1      125d
The V1 container network architecture directly uses the podCIDR in the node spec, so no further modification is needed.
For the V2 architecture, modify cce-network-v2-config; the corresponding field is cluster-pool-ipv4-mask-size. Then restart the cce-network-operator and cce-network-agent components so that the configuration takes effect.
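In outline, the V2 commands are the same as those used in the practical case later in this article (a sketch, with the resource names as shown there):
kubectl -n kube-system edit cm cce-network-v2-config              # set cluster-pool-ipv4-mask-size
kubectl -n kube-system rollout restart deployment cce-network-operator
kubectl -n kube-system rollout restart daemonset cce-network-agent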
Step 3: Remove and move in the node again
There are two methods to remove a node from the cluster and rejoin it:
- From the CCE product console interface, select "Remove Node" or "Delete Node," and then "Move into Node" or "Add a Node."
- Execute kubectl delete node <nodeName> to remove the node from the Kubernetes cluster. Execute kubectl get pods --all-namespaces=true -o wide | grep <nodeName> to confirm that no pods remain on the node. Then restart the kubelet on the node to rejoin it to the cluster.
Note: Regardless of which method is used, all pods on the removed node will be rescheduled. For nodes hosting online services, proceed with caution.
Practical case
Problem scenario
Currently, a cluster has a container network segment of 172.16.0.0/16, with the kube-controller-manager configured with --node-cidr-mask-size=24. That is, the cluster can accommodate up to 256 nodes, with a maximum pod count of 256 per node. The node capacity now needs to be increased to 1,024.
[root@root ~]# kubectl get no
NAME          STATUS   ROLES    AGE   VERSION
192.168.1.4   Ready    <none>   42m   v1.24.4
192.168.1.5   Ready    <none>   35m   v1.24.4
root@root:~# kubectl describe node | grep -i podcidr
PodCIDR:     10.0.1.0/24
PodCIDRs:    10.0.1.0/24
PodCIDR:     10.0.0.0/24
PodCIDRs:    10.0.0.0/24
Modification steps
Step 1: Modify configuration
Check the node-cidr-mask-size configuration:
[root@root manifests]# kubectl get po kube-controller-manager-192.168.1.4 -n kube-system -o yaml | grep node-cidr-mask-size
    - --node-cidr-mask-size=24
Change the node-cidr-mask-size value from 24 to 26. After the update, the cluster can support up to 1,024 nodes, but the maximum number of pods per node will be reduced to 64.
Modify the kube-controller-manager parameter node-cidr-mask-size=26:
vim /etc/kubernetes/manifests/kube-controller-manager.yaml

apiVersion: v1
kind: Pod
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --cluster-cidr=172.16.0.0/16
    - --feature-gates=MixedProtocolLBService=true
    - --master=https://192.168.1.4:6443
    - --node-cidr-mask-size=26
    ……
Here, kube-controller-manager is deployed statically. For static pods, kubelet will monitor changes to the definition files. After saving and closing the editor, kubelet will detect the file changes and automatically delete old pods and start new pods based on the updated definition file.
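Once kubelet has recreated the pod, the new value can be verified the same way it was checked earlier (a sketch; substitute the actual pod name for your cluster):
kubectl -n kube-system get pod kube-controller-manager-192.168.1.4 -o yaml | grep node-cidr-mask-size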
Step 2: Modify the network component
The V2 network architecture used here requires modifying the cluster-pool-ipv4-mask-size field in cce-network-v2-config.
kubectl edit cm cce-network-v2-config -n kube-system

apiVersion: v1
data:
  cced: |
    annotate-k8s-node: true
    api-rate-limit:
      bcecloud/apis/v1/AttachENI: rate-limit:5/1s,rate-burst:5,max-wait-duration:30s,parallel-requests:5,log:true
      bcecloud/apis/v1/BatchAddPrivateIP: rate-limit:5/1s,rate-burst:10,max-wait-duration:15s,parallel-requests:5,log:true
      bcecloud/apis/v1/BatchDeletePrivateIP: rate-limit:5/1s,rate-burst:10,max-wait-duration:15s,parallel-requests:5,log:true
      bcecloud/apis/v1/CreateENI: rate-limit:5/1s,rate-burst:5,max-wait-duration:30s,parallel-requests:5,log:true
      bcecloud/apis/v1/DescribeSubnet: rate-limit:5/1s,rate-burst:5,max-wait-duration:30s,parallel-requests:5
      bcecloud/apis/v1/StatENI: rate-limit:10/1s,rate-burst:15,max-wait-duration:30s,parallel-requests:10
    auto-create-network-resource-set-resource: true
    bbcEndpoint: bbc.gz.baidubce.com
    bccEndpoint: bcc.gz.baidubce.com
    bce-cloud-access-key: ""
    bce-cloud-country: cn
    bce-cloud-host: cce-gateway.gz.baidubce.com
    bce-cloud-region: gz
    bce-cloud-secure-key: ""
    bce-cloud-vpc-id: vpc-2f5wibbx4js7
    cce-cluster-id: cce-clboj6fa
    cce-endpoint-gc-interval: 30s
    cluster-pool-ipv4-cidr:
    - 172.16.0.0/16
    cluster-pool-ipv4-mask-size: 26
Restart the corresponding components, cce-network-operator and cce-network-agent, to ensure the configuration takes effect.
kubectl rollout restart deployment cce-network-operator -n kube-system
kubectl rollout restart daemonset cce-network-agent -n kube-system
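To confirm that both components have finished restarting before moving on to the next step, the rollout status can be checked (standard kubectl, shown as a sketch):
kubectl -n kube-system rollout status deployment cce-network-operator
kubectl -n kube-system rollout status daemonset cce-network-agent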
Step 3: Remove and move in the node again
Execute the kubectl delete command to delete the corresponding node.
kubectl delete node 192.168.1.4
Check the cluster status to confirm the 192.168.1.4 node is no longer present:
[root@root ~]# kubectl get no
NAME          STATUS   ROLES    AGE   VERSION
192.168.1.5   Ready    <none>   47m   v1.24.4
root@root:~# kubectl describe node | grep -i podcidr
PodCIDR:     10.0.1.0/24
PodCIDRs:    10.0.1.0/24
Make sure there are no pods still running on 192.168.1.4; if any remain, rejoining the node will not change its pod CIDR. Use the following command to check:
kubectl get pods --all-namespaces=true -o wide | grep <nodeName>
On node 192.168.1.4, execute systemctl restart kubelet.service to restart kubelet. The cluster status then shows that node 192.168.1.4 has rejoined, and its container network segment has changed to 10.0.0.0/26:
[root@root ~]# kubectl get node
NAME          STATUS   ROLES    AGE    VERSION
192.168.1.4   Ready    <none>   11m    v1.24.4
192.168.1.5   Ready    <none>   102m   v1.24.4
[root@root-3 ~]# kubectl describe node | grep -i podcidr
PodCIDR:     10.0.0.0/26
PodCIDRs:    10.0.0.0/26
PodCIDR:     10.0.1.0/24
PodCIDRs:    10.0.1.0/24
Users should continue following the steps outlined above to add or remove nodes as needed to create additional expansion capacity.
Note: While nodes with differing PodCIDR masks can theoretically coexist in the same cluster, it is recommended to add or remove all nodes at once to ensure they share the same PodCIDR mask.
