          Relational Database Service

          CPU Alarm Handling Method

          Background

          CPU occupancy rate is the monitoring item that tracks an RDS instance's CPU usage. A high CPU occupancy rate means the database is under heavy load, which usually leads to the following problems:

          • Database responses slow down: SQL statements take longer to execute and requests time out.
          • Database read/write QPS declines.

          Baidu RDS adds an alarm policy for the CPU occupancy rate monitoring item by default. With the default setting, an alarm is triggered when the CPU occupancy rate exceeds 95%. You can adjust the alarm threshold to suit your needs.

          The following sections describe how to respond to and handle a CPU occupancy rate alarm.

          References:

          • Monitoring and Alarm Operations Guide
          • MySQL Slow Log Best Practices

          Problem Handling

          Find the problem

          You can detect a CPU occupancy rate problem in the following ways:

          1. The system sends an alarm notification when the CPU occupancy rate exceeds the alarm threshold.
          2. The RDS monitoring trend chart shows an abnormal CPU occupancy rate curve.
          3. Database access takes longer and responses slow down, which may indicate that abnormal CPU usage is lengthening database response time.
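
          For the third channel, a rising slow-query count is one hint that response times are degrading. The following is a minimal sketch that uses standard MySQL status and system variables, assuming the database account is allowed to read them:

          ```sql
          -- Cumulative count of statements that ran longer than long_query_time
          -- since the server started (or since the last FLUSH STATUS).
          -- Sample it twice and compare to see whether it is climbing.
          SHOW GLOBAL STATUS LIKE 'Slow_queries';

          -- Threshold, in seconds, above which a statement counts as slow.
          SHOW GLOBAL VARIABLES LIKE 'long_query_time';

          -- Whether the slow query log is enabled.
          SHOW GLOBAL VARIABLES LIKE 'slow_query_log';
          ```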

          Locate the problem

          • Step 1: Log in to the RDS instance with a database account, run the following command to view the current process status, and check for unexpected long-running SQL statements:
          SHOW PROCESSLIST;

          For example, the following results are shown:

          | Id    | User     | Host               | db        | Command | Time | State        | Info |
          | 10001 | baidu_dw | 127.0.0.1:39640    | baidu_dba | Query   | 163  | Sending data | select t1.id,t1.data, t2.id,t2.data from tb_01 t1,tb_02 t2 where ... |
          | 10002 | baidu_dw | 127.0.0.1:39646    | baidu_dba | Query   | 158  | Sending data | select t1.id,t1.data, t2.id,t2.data from tb_01 t1,tb_02 t2 where ... |
          | 10003 | baidu_dw | 127.0.0.1:39652    | baidu_dba | Query   | 153  | Sending data | select t1.id,t1.data, t2.id,t2.data from tb_01 t1,tb_02 t2 where ... |
          | 10004 | baidu_dw | 127.0.0.1:39728    | baidu_dba | Query   | 88   | Sending data | select t1.id,t1.data, t2.id,t2.data from tb_01 t1,tb_02 t2 where ... |
          | 10005 | baidu_dw | 192.168.0.69:39766 | baidu_dba | Sleep   | 6    |              | NULL |
          • Step 2: Analyze the results. The "processlist" output shows four long-running slow SQL statements, all in the "Sending data" state and running for 88 to 163 seconds. These queries are the cause of the CPU occupancy rate alarm.
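
          When many connections are open, the full "SHOW PROCESSLIST" output can be hard to scan. As a minimal sketch, the long-running statements can be filtered directly. It assumes the account has the PROCESS privilege and that the instance exposes the information_schema.PROCESSLIST table (deprecated in recent MySQL 8.0 releases in favor of performance_schema.processlist). The 60-second cutoff is an illustrative assumption, not a recommended value:

          ```sql
          -- List statements that have been executing for more than 60 seconds.
          -- The threshold is an assumption; tune it to your workload.
          SELECT id, user, host, db, command, time, state,
                 LEFT(info, 100) AS info_preview   -- truncate long SQL text
          FROM information_schema.PROCESSLIST
          WHERE command = 'Query'                  -- skip sleeping connections
            AND time > 60
          ORDER BY time DESC;
          ```

          Run against the sample output above, this filter would keep threads 10001 through 10004 and drop the sleeping connection 10005.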

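          Resolve the problem

          First, confirm whether these slow queries can be terminated. If so, log in to the RDS instance and run the "KILL" command for each one, where threadID is the value in the first column of the "SHOW PROCESSLIST" output: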

          KILL threadID;

          The client whose connection was killed then receives the following error, which is expected:

          ERROR 2006 (HY000): MySQL server has gone away 
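
          If dropping the client's connection is not desirable, MySQL also accepts a QUERY modifier on KILL, which terminates only the statement that the thread is running and leaves the connection open. In that case the client typically sees ERROR 1317 (70100): Query execution was interrupted instead of a lost connection. The following sketch uses thread ID 10001 from the sample output above. The 60-second cutoff in the last statement is an illustrative assumption, and that statement also assumes access to information_schema.PROCESSLIST:

          ```sql
          -- Option A: terminate only the running statement; the connection stays open.
          KILL QUERY 10001;

          -- Option B: terminate the whole connection (same as a plain KILL 10001).
          KILL CONNECTION 10001;

          -- Generate one KILL statement per long-running query, so the list can be
          -- reviewed before any of the statements is actually run.
          SELECT CONCAT('KILL QUERY ', id, ';') AS kill_statement
          FROM information_schema.PROCESSLIST
          WHERE command = 'Query'
            AND time > 60;
          ```

          Options A and B are alternatives for the same thread. Generating the statements first, rather than killing in bulk, keeps the confirmation step described above in place.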

          Finally, check the monitoring trend chart again to confirm that the CPU occupancy rate has returned to normal.
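
          Besides the trend chart, a quick check from the SQL client can confirm that the load has dropped. This is a minimal sketch that uses the standard Threads_running status variable, which counts threads that are not sleeping. What counts as a normal value depends on the workload:

          ```sql
          -- Number of threads currently executing (not sleeping).
          SHOW GLOBAL STATUS LIKE 'Threads_running';

          -- Re-check that the terminated statements no longer appear.
          SHOW FULL PROCESSLIST;
          ```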
