          Cloud Container Engine

          Cluster Automatic Scaling

          Overview of Automatic Scaling

CCE runs on a cluster formed by a set of cloud platform servers. The cluster provides users' containers with the basic resources they need, such as CPU, memory, and disk. The cluster size is normally defined by the user when the CCE service is created, and the cluster can also be scaled up or down at any time while CCE is in use. However, when business growth exceeds expectations or peak loads fluctuate, the resources provided by the cluster may not be enough to support the workload, causing the service to slow down.

With the automatic scaling feature of CCE enabled, the cluster automatically creates nodes when resources are insufficient and automatically releases surplus nodes when resources are idle, so that cluster resources are always sufficient for the service load while costs are kept as low as possible. When enabling automatic scaling, users can also set the maximum and minimum numbers of nodes, so that scaling stays within the expected range.

          Interpretation of Concepts

Scaling group: A set of nodes with the same configuration. The group is automatically scaled up and down using the group's machine configuration.
Scaling group min: When the scaling group meets the scale-down conditions, the number of nodes after scaling down will not fall below this value.
Scaling group max: When the scaling group meets the scale-up conditions, the number of nodes after scaling up will not exceed this value.
Scale-down threshold: The cluster may trigger automatic scale-down when the resource usage (CPU and memory) of nodes in the scaling group falls below this threshold.
Scale-down trigger delay: The cluster triggers automatic scale-down only when node resource usage stays below the scale-down threshold for this configured period.
Maximum concurrent scale-down: An integer specifying how many nodes with 0% resource usage can be scaled down at the same time.
Scale-down delay after scale-up: An integer in minutes. After this interval, evaluation of whether the newly scaled-up nodes can be scaled down begins.
Pods with local storage: During scale-down, you can choose to skip nodes that run pods with local storage.
Pods in the kube-system namespace: During scale-down, you can choose to skip nodes that run non-DaemonSet pods in the kube-system namespace.
Scale-up selection policy for multiple scaling groups: random (randomly select one scaling group from those that meet the scale-up conditions), least-waste (select the scaling group that leaves the least idle resources while meeting the pod's requirements), most-pods (select the scaling group that can schedule the most pods during the scale-up).

          Operation Guide

          Configure Automatic Scaling of Clusters

1. Log in to the cloud platform management console and go to "Product Service > Cloud Container Engine (CCE)". Click "Cluster Management > Cluster List" in the left navigation bar to open the cluster list page. Under "More Operations" in the operation column, click "Automatic Scaling".


2. The cluster automatic scaling page opens. When configuring a new cluster for the first time, click "Authorize" in the scaling group list to enable the automatic scaling feature.


3. After authorization is completed, click "Create a Scaling Group" to open the scaling group creation page, and configure the automatic scaling node range and the node configuration.


• The minimum number of nodes for automatic scaling must be greater than or equal to 0, and the maximum number of nodes cannot exceed the node quota of the current cluster.
• Creating a scaling group does not directly create nodes. The cluster automatically creates and releases nodes within the configured minimum and maximum node range according to the service load.

          Select the node configuration according to requirements:

Labels can also be added to the scaling group.


4. After the scaling group is created, click the "Edit" button at the top right corner of the global configuration section to modify the global automatic scaling configuration.

In the pop-up global configuration dialog, you can switch automatic scaling on and off and configure settings such as the scale-down threshold and the scale-down trigger delay.

FAQs on Automatic Scaling

1. What conditions trigger a scale-up?

• There are pending pods in the cluster due to insufficient resources (CPU and memory); see the sketch after this list.
• The number of nodes in the scaling group has not yet reached the configured max.
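
As an illustration (not from the CCE documentation itself), the minimal pod below requests more CPU and memory than the existing nodes may have free; if no node can satisfy the request, the pod stays Pending and the autoscaler considers a scale-up, provided the scaling group max has not been reached. The pod name, image, and request sizes are assumptions chosen for the example.

    # Hypothetical pod whose CPU/memory requests may exceed the free capacity of existing nodes;
    # if it stays Pending for lack of resources, it can trigger a scale-up.
    apiVersion: v1
    kind: Pod
    metadata:
      name: scale-up-demo            # assumed name
    spec:
      containers:
        - name: app
          image: nginx:1.25          # assumed image
          resources:
            requests:
              cpu: "4"               # sized to exceed free capacity on existing nodes
              memory: 8Gi
            limits:
              cpu: "4"
              memory: 8Gi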

2. Why can't the number of nodes in the scaling group sometimes be scaled down to 0?

• Each time the configuration is modified, the autoscaler component is deleted and restarted. If the autoscaler component is scheduled onto a node in the scaling group, that node cannot be scaled down. (Refer to Question 5.)

3. Why can't a machine that has just been scaled up be scaled down even though it meets the conditions?

• A machine that has just been scaled up has a 10-minute protection period. It is considered for scale-down only after those 10 minutes.

4. Why does a modified configuration not take effect immediately?

• After the configuration (group configuration or scale-down configuration) is modified, the existing nodes in the scaling group are treated as newly scaled-up nodes. Only after another 10 minutes (this period is adjustable) can they meet the scale-down conditions.

5. Why is a machine that meets the scale-down threshold and time not scaled down?

• First, check whether the number of nodes in the group has already reached the configured min.
• Check whether the pods on this machine can be scheduled to other nodes. If they cannot, this machine cannot be scaled down either.
• Check whether the node is marked as not to be scaled down (annotation "cluster-autoscaler.kubernetes.io/scale-down-disabled": "true"); a minimal sketch follows this list.
• Check whether the node has just been scaled up (--scale-down-delay-after-add, 10 minutes by default; a node that has just been scaled up is not considered for scale-down within that time).
• Check whether a scale-up of this group failed in the past 3 minutes (--scale-down-delay-after-failure, this parameter can be set).
• Check whether the --scale-down-delay-after-delete (interval between two adjacent scale-downs) and --scan-interval (scanning interval) parameters are set.
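
The scale-down-disabled annotation mentioned above is set on the Node object. A minimal sketch is shown below; the node name demo-node is hypothetical, and in practice the annotation is usually added with kubectl annotate rather than by editing the manifest.

    # Node manifest fragment marking the node so the autoscaler will not scale it down.
    apiVersion: v1
    kind: Node
    metadata:
      name: demo-node                # hypothetical node name
      annotations:
        cluster-autoscaler.kubernetes.io/scale-down-disabled: "true"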

6. How do I view the status of the autoscaler component in the cluster?

• View the status ConfigMap: kubectl get configmap cluster-autoscaler-status -n kube-system -o yaml
• View the autoscaler logs: kubectl logs $(kubectl get pods -n kube-system | grep cluster-autoscaler | awk '{print $1}') -n kube-system -f

7. Why is the usage rate displayed by the autoscaler component different from the one I calculate?

• CCE reserves part of each machine's resources. The CPU cores and memory calculated by the autoscaler component are the resources that can actually be allocated to user workloads.

8. How do I schedule a pod to a designated scaling group?

• When creating a scaling group, specify a label for the scaling group. All nodes in the scaling group carry this label, so pods can be scheduled into the group with nodeSelector or node affinity. (See the sketch after Question 9.)

9. How do I use a GPU scaling group?

• Create a scaling group of GPU type. When creating the pod, set nvidia.com/gpu: 1 in both requests and limits, i.e., the number of GPU cards required by the pod. The card count can only be an integer.
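
A minimal sketch combining the two answers above: the nodeSelector targets the scaling group's label, and the resources section requests one GPU card (an integer). The label key/value (scaling-group: gpu-group), the pod name, and the image are assumptions; replace them with the label actually set when creating the scaling group.

    # Pod scheduled to a designated (GPU) scaling group via nodeSelector,
    # requesting one GPU card; the card count must be an integer.
    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-demo                         # assumed name
    spec:
      nodeSelector:
        scaling-group: gpu-group             # assumed label of the scaling group
      containers:
        - name: cuda-app
          image: nvidia/cuda:12.2.0-base-ubuntu22.04   # assumed image
          resources:
            requests:
              nvidia.com/gpu: 1
            limits:
              nvidia.com/gpu: 1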

          Note

• If your service cannot tolerate interruptions, automatic scaling is not recommended: during scale-down, some pods may be restarted on other nodes, which can cause a brief interruption.
• Do not modify the nodes of a scaling group directly. Nodes in the same scaling group should have the same machine configuration (CPU, memory, etc.), the same labels, and the same system pods.
• Check whether the number of machine instances allowed for your account covers the min/max configured for the scaling group.
• Specify the request field (resource requests) when creating pods.
• Limit the memory and CPU resources used by the autoscaler component. The autoscaler component requests 300 MiB of memory and 0.1 CPU core by default, and no resource limit is set. To protect your machine resources, if your cluster is large you can use the formula below to estimate the autoscaler component's quota: MEM = job_num * 10KB + pod_num * 25KB + 22MB + node_num * 200KB; CPU = 0.5 Core ~ 1 Core. This is the minimum requirement under ideal conditions; additional CPU and memory are needed when the cluster has many pending pods or frequent scale-ups and scale-downs. (A worked example follows below.)
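
As a hedged, worked example (the cluster size and resulting figures are assumptions, not values from this documentation): for a cluster with 200 nodes, 5,000 pods and 100 jobs, the formula gives MEM = 100*10KB + 5000*25KB + 22MB + 200*200KB ≈ 188MB, so a memory limit of about 512Mi (above both the estimate and the 300Mi default request) and a CPU limit of 1 core leave reasonable headroom. Expressed as the resources section of the autoscaler component's container spec, this could look like:

    # Hypothetical resources section for the autoscaler component's container,
    # sized from the formula above for a 200-node / 5,000-pod / 100-job cluster.
    resources:
      requests:
        cpu: 500m          # lower end of the 0.5-1 Core range
        memory: 300Mi      # the component's default request
      limits:
        cpu: "1"           # upper end of the 0.5-1 Core range
        memory: 512Mi      # headroom above the ~188MB estimate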