Setting of Automatic Horizontal Scaling of Working Load

Last Updated：2020-10-12

Introduction to HPA (Horizontal Pod Autoscaling)

HPA can automatically scale the number of pods in a deployment or replica set according to CPU utilization or metrics provided by some applications, which can better deal with scenarios such as burst traffic.

This article introduces how to use HPA to automatically expand and shrink applications in Cloud Container Engine (CCE).

Verify HPA

To demonstrate HPA, this article uses a custom Docker image based on php-apache image. The image includes an index.php page, which contains some code that can run CPU intensive computing tasks. The example Dockerfile is as follows:

FROM php:5-apache 

ADD index.php /var/www/html/index.php 

RUN chmod a+rx index.php

The example index.php file is as follows:

<?php 
$x = 0.0001; 
for ($i = 0; $i<= 1000000; $i++) { 
	 $x += sqrt($x); 
} 
echo "OK!"; 
?>

Precondition

1.A Kubernetes cluster has been deployed through CCE.

2.Kubectl has been configured to access the Kubernetes cluster locally.

1.Deployment and Operation

First, deploy a Deployment to run the above Docker image and expose it as a Kubernetes service.

kubectl run php-apache --image=hpa-example --requests=cpu=200m --expose --port=80 

service "php-apache" created 
deployment "php-apache" created

2.Create HPA

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10 
deployment "php-apache" autoscaled 


After creation, the current monitoring value is unknown. After step 3, wait for 1-2 minutes, and then it will become normal percentage 
[root@instance-2tpjy37t ~]# kubectl get hpa 
NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE 
php-apache   Deployment/php-apache  <unknown>/50%   1         10         1          5s

The corresponding YAML is as follows:

apiVersion: autoscaling/v2alpha1 
kind: HorizontalPodAutoscaler 
metadata: 
  name: php-apache 
  namespace: default 
spec: 
  scaleTargetRef: 
    apiVersion: apps/v1beta1 
    kind: Deployment 
    name: php-apache 
  minReplicas: 1 
  maxReplicas: 10 
  metrics: 
  - type: Resource 
    resource: 
      name: cpu 
      targetAverageUtilization: 50

Field Explanation:

scaleTargetRef : target object of HPA autoscaling
minReplicas : Minimum Pod quantity
maxReplicas : Maximum number of pods allowed
metrics : Metrics
targetAverageUtilization: That is, the set resource utilization rate, which triggers horizontal expansion when exceeding the set value

HPA currently supports three types of metrics. For details, please see Kubernetes Horizontal Pod Autoscaler:

Predefined metrics: CPU and memory usage of Pod (built-in support).
Custom Pod metrics: Monitoring indicators provided by the application (monitoring shall be deployed and metric server shall be customized).
Custom Object indicator: Monitoring indicators of other resources in the same namespace as Pod (need to deploy monitoring and customize metric server).

3.Add Load to php-apache Service and Verify Automatic Scale-up/down

Start a container and send infinite query requests to the php-apache server through a loop (run the following command on another terminal).

kubectl run -i --tty load-generator --image=busybox /bin/sh 
 
Hit enter for command prompt 
 
$ while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done 
 
Output: 
OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!!OK!OK!OK!OK

4.Observe the Change of HPA

As the load increases, the number of replicas in the Deployment begins to increase.

[root@instance-2tpjy37t ~]# kubectl get hpa 
NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE 
php-apache   Deployment/php-apache   332%/50%   1         10         7          19m 

[root@instance-2tpjy37t ~]# kubectl get deployment php-apache 
NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE 
php-apache   7         7         7            7           19m

5.Stop the Load

In the terminal of the container generating the load, input <Ctrl>+c to terminate the load generation. Then check the load status again (wait a few minutes).

[root@instance-2tpjy37t ~]# kubectl get hpa 
NAME          REFERENCE                      TARGET        MINPODS    MAXPODS    REPLICAS    AGE 
php-apache    Deployment/php-apache/scale    0%50%      1          10         1           11m 

# As the load decreases, so does the number of pods 
[root@instance-2tpjy37t ~]# kubectl get deployment php-apache 
NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE 
php-apache    1          1          1             1            27m

View Container Group

Create Working Load by Private Images