百度智能云

All Product Document

          Cloud Container Engine

          Setting of Automatic Horizontal Scaling of Working Load

          Introduction to HPA (Horizontal Pod Autoscaling)

          HPA can automatically scale the number of pods in a deployment or replica set according to CPU utilization or metrics provided by some applications, which can better deal with scenarios such as burst traffic.

          This article introduces how to use HPA to automatically expand and shrink applications in Cloud Container Engine (CCE).

          Verify HPA

          To demonstrate HPA, this article uses a custom Docker image based on php-apache image. The image includes an index.php page, which contains some code that can run CPU intensive computing tasks. The example Dockerfile is as follows:

          FROM php:5-apache 
          
          ADD index.php /var/www/html/index.php 
          
          RUN chmod a+rx index.php 

          The example index.php file is as follows:

          <?php 
          $x = 0.0001; 
          for ($i = 0; $i<= 1000000; $i++) { 
          	 $x += sqrt($x); 
          } 
          echo "OK!"; 
          ?> 

          Precondition

          1.A Kubernetes cluster has been deployed through CCE.

          2.Kubectl has been configured to access the Kubernetes cluster locally.

          1.Deployment and Operation

          First, deploy a Deployment to run the above Docker image and expose it as a Kubernetes service.

          kubectl run php-apache --image=hpa-example --requests=cpu=200m --expose --port=80 
          
          service "php-apache" created 
          deployment "php-apache" created 

          2.Create HPA

          kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10 
          deployment "php-apache" autoscaled 
          
          
          After creation, the current monitoring value is unknown. After step 3, wait for 1-2 minutes, and then it will become normal percentage 
          [root@instance-2tpjy37t ~]# kubectl get hpa 
          NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE 
          php-apache   Deployment/php-apache  <unknown>/50%   1         10         1          5s 

          The corresponding YAML is as follows:

          apiVersion: autoscaling/v2alpha1 
          kind: HorizontalPodAutoscaler 
          metadata: 
            name: php-apache 
            namespace: default 
          spec: 
            scaleTargetRef: 
              apiVersion: apps/v1beta1 
              kind: Deployment 
              name: php-apache 
            minReplicas: 1 
            maxReplicas: 10 
            metrics: 
            - type: Resource 
              resource: 
                name: cpu 
                targetAverageUtilization: 50 

          Field Explanation:

          • scaleTargetRef : target object of HPA autoscaling
          • minReplicas : Minimum Pod quantity
          • maxReplicas : Maximum number of pods allowed
          • metrics : Metrics
          • targetAverageUtilization: That is, the set resource utilization rate, which triggers horizontal expansion when exceeding the set value

          HPA currently supports three types of metrics. For details, please see Kubernetes Horizontal Pod Autoscaler:

          • Predefined metrics: CPU and memory usage of Pod (built-in support).
          • Custom Pod metrics: Monitoring indicators provided by the application (monitoring shall be deployed and metric server shall be customized).
          • Custom Object indicator: Monitoring indicators of other resources in the same namespace as Pod (need to deploy monitoring and customize metric server).

          3.Add Load to php-apache Service and Verify Automatic Scale-up/down

          Start a container and send infinite query requests to the php-apache server through a loop (run the following command on another terminal).

          kubectl run -i --tty load-generator --image=busybox /bin/sh 
           
          Hit enter for command prompt 
           
          $ while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done 
           
          Output: 
          OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!OK!!OK!OK!OK!OK 

          4.Observe the Change of HPA

          As the load increases, the number of replicas in the Deployment begins to increase.

          [root@instance-2tpjy37t ~]# kubectl get hpa 
          NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE 
          php-apache   Deployment/php-apache   332%/50%   1         10         7          19m 
          
          [root@instance-2tpjy37t ~]# kubectl get deployment php-apache 
          NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE 
          php-apache   7         7         7            7           19m 

          5.Stop the Load

          In the terminal of the container generating the load, input <Ctrl>+c to terminate the load generation. Then check the load status again (wait a few minutes).

          [root@instance-2tpjy37t ~]# kubectl get hpa 
          NAME          REFERENCE                      TARGET        MINPODS    MAXPODS    REPLICAS    AGE 
          php-apache    Deployment/php-apache/scale    0%50%      1          10         1           11m 
          
          # As the load decreases, so does the number of pods 
          [root@instance-2tpjy37t ~]# kubectl get deployment php-apache 
          NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE 
          php-apache    1          1          1             1            27m 
          Previous
          View Container Group
          Next
          Create Working Load by Private Images