          Configure Alarm Rules

          Alarm Overview

          Cloud Container Engine (CCE) provides quick, visual alarm configuration based on Prometheus and Alertmanager. Users can configure alarm rules for dimensions such as nodes and applications as required. Alarms are sent to the specified users or user groups by email or SMS.

          Prerequisites

          1. A Kubernetes cluster has been deployed through CCE.

          2. The core container monitoring service, Prometheus (including Alertmanager), has been deployed on the container monitoring page.

          Configure Alarm Rules

          Alarm configuration consists of two steps: rule configuration and global configuration.

          • Configure Rule: Defines the alarm rules.
          • Global Configuration: Routes alarms to different receiving users or user groups.

          Configuration Entry

          Go to "Product Service > Cloud Container Engine (CCE)" and click "Monitoring Log > Cloud Container Engine (CCE)" in the left navigation bar to enter the container monitoring page. Then either click "Configure" in the "Configure Alarm Rule" module, or click "Configure Alarm" in the alertmanager row of the component list.

          Configure Rule

          Enter the "Configure Rule" tab page, as shown below.

          The rule list page lets you view all alarm rules, add new rules, and modify or delete existing ones.

          Click "Create New Alarm Rule" to open the alarm rule configuration page, as shown below.

          Configure the rule as required. The parameters are described as follows:

          • Rule group: Groups related alarm rules so that a large number of rules stays manageable.
          • Rule name: The name of the alarm rule, which is also used as the title of the alarm message.
          • Duration: How long the trigger condition must hold continuously before the alarm is sent, in seconds.
          • Expression: A valid PromQL statement, such as node_CPU>90. For expression syntax, see syntax rules.
          • Alarm Description: A customizable description that appears in the body of the alarm email. See Syntax Reference for details. "Null" means no specific description.
          • Tags: Custom labels, several of which can be configured per rule. Routing rules in the global configuration filter on these labels to match alarms to different recipients.

          After configuration, click "OK" to submit. Reminder: Each create, modify, or delete operation takes about 60 seconds to take effect.
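
          As a rough illustration of how the parameters above relate to each other, the following Python sketch assembles them into the structure used by standard Prometheus alerting rule files (groups, alert, expr, for, labels, annotations). It assumes that the console fields map one-to-one onto those keys, which this document does not confirm. The helper build_rule_group and all sample values except the expression are hypothetical.

```python
import json


def build_rule_group(group, name, duration_s, expression, description, tags):
    """Assemble console-style fields into a Prometheus-style rule group.

    Assumption: this field-to-key mapping is illustrative only; the
    document does not state how CCE stores rules internally.
    """
    if duration_s < 0:
        raise ValueError("duration must be a non-negative number of seconds")
    rule = {
        "alert": name,                                 # Rule name, also the alarm title
        "expr": expression,                            # PromQL expression
        "for": f"{duration_s}s",                       # Duration, in seconds
        "labels": dict(tags),                          # Tags, used later for routing
        "annotations": {"description": description},   # Alarm description
    }
    return {"groups": [{"name": group, "rules": [rule]}]}


rule_group = build_rule_group(
    group="node-alarms",                  # hypothetical rule group
    name="NodeCpuHigh",                   # hypothetical rule name
    duration_s=60,
    expression="node_CPU>90",             # example expression from this document
    description="Node CPU has stayed above the threshold",
    tags={"env": "test", "alert_type": "node_cpu"},
)
print(json.dumps(rule_group, indent=2))
```

          The labels set here are what the routing rules and aggregation groups in the global configuration operate on, so it helps to choose them with the intended recipients in mind.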

          Global Configuration

          Enter the "Global Configuration" tab page, as shown below.

          In global configuration, you can view or configure routing rules and aggregation groups.

          Click "Create New Routing Rule" to open the routing rule configuration page, as shown below.

          A routing rule defines the alarm recipients, sending interval, and other settings that are matched when an alarm is triggered (FIRE status).

          Configure the routing rule as required. The parameters are described as follows:

          • Matching rules: Label conditions that correspond to the tags set in each alarm rule. Multiple alarm rules that carry the same labels can be matched together and sent to the same group of recipients at the same sending frequency.
          • Send interval: The interval between alarm notifications, in seconds.
          • Notification type: Email and SMS alarms are currently supported.
          • Notification object: Users and user groups can be selected. Users are either common sub-users or message recipients. Both types can verify their mobile phone number and email address in Identity and Access Management. After a user or group is added, the phone number and email address must be verified before alarms can be received.
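
          The matching described above can be pictured as a label comparison: a routing rule applies to an alarm when every label in its matching rules appears on the alarm with the same value. The Python sketch below assumes exact-match semantics and that the first matching routing rule wins, neither of which is stated in this document. Route, find_route, and the sample recipients are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Route:
    """One routing rule: label matchers plus delivery settings."""
    matchers: Dict[str, str]
    send_interval_s: int
    notification_types: List[str]
    recipients: List[str] = field(default_factory=list)

    def matches(self, labels: Dict[str, str]) -> bool:
        # Every matcher must be present on the alarm with an equal value.
        return all(labels.get(key) == value for key, value in self.matchers.items())


def find_route(labels: Dict[str, str], routes: List[Route]) -> Optional[Route]:
    """Return the first route whose matchers all match (assumed ordering)."""
    for route in routes:
        if route.matches(labels):
            return route
    return None


routes = [
    Route({"env": "test"}, 300, ["email"], ["test-team"]),
    Route({"env": "dev", "alert_type": "app_down"}, 600, ["email", "sms"], ["dev-oncall"]),
]

alarm_labels = {"alertname": "AppDown", "env": "dev", "alert_type": "app_down"}
route = find_route(alarm_labels, routes)
print(route.recipients if route else "no matching route")
```

          Because several alarm rules can carry the same labels, one routing rule in this sketch serves all of them with the same recipients and send interval.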

          Reminder:

          1. To keep alarm delivery secure, a single cluster or a single user should trigger no more than 100 alarms per minute.
          2. If no email or SMS alarm is received, check whether you have set blocking rules, such as SMS shielding.
          3. Specific mailbox alarms configured by existing users still work. Please contact the administrator if you have any problems.

          Reminder: To keep alarm emails secure, a single cluster can send no more than 6 emails per minute.
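
          When planning send intervals, it can help to estimate whether an expected notification pattern stays within these per-minute limits. The Python sketch below counts the busiest 60-second window in a list of send times. It is only a planning aid: it assumes a sliding window, whereas this document does not say how the platform measures a minute, and max_per_minute and the sample timestamps are hypothetical.

```python
def max_per_minute(timestamps, window_s=60):
    """Return the largest number of events in any window of window_s seconds."""
    ordered = sorted(timestamps)
    best = 0
    start = 0
    for end, ts in enumerate(ordered):
        # Drop events that are window_s or more seconds older than the current one.
        while ts - ordered[start] >= window_s:
            start += 1
        best = max(best, end - start + 1)
    return best


EMAIL_LIMIT_PER_MINUTE = 6                              # limit quoted in this document
send_times = [0, 5, 10, 20, 30, 45, 59, 61, 130]        # hypothetical send times, in seconds
peak = max_per_minute(send_times)
print("busiest minute:", peak, "within limit:", peak <= EMAIL_LIMIT_PER_MINUTE)
```

          If the estimate exceeds the limit, a longer send interval or a coarser aggregation group reduces the number of messages.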

          Aggregation Group

          The aggregation group determines how generated alarms are grouped: alarms with the same values for the grouping labels are combined and sent as one group. When a large fault occurs (such as a network fault), the sheer number of alarms makes it hard to locate the problem quickly, and grouping reduces this noise. The default aggregation group is alertname, that is, alarms from different rules are not grouped together by default. Users can add or remove aggregation groups as required.

          Click "Add aggregate group" and configure it in the input box that pops up.

          Example

          Aggregate all application exception alarms for each environment. If a network failure or other fault causes a large number of application exceptions, all of the resulting alarms are combined into one alarm message. Configuration steps:

          1. Add labels to all application exception alarm rules in the test environment: env: test, alert_type: app_down.

          2. Add labels to all application exception alarm rules in the dev environment: env: dev, alert_type: app_down.

          3. Add the aggregation group labels: env and alert_type.

          Alarm sending:

          All alarms with the labels env=test and alert_type=app_down are aggregated, that is, all application exception alarms in the test environment are sent in one message. Similarly, all application exception alarms in the dev environment are sent in one message.
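
          The effect of this example can be sketched in Python by grouping alarms on the values of the chosen labels. The sketch assumes that the aggregation group contains only env and alert_type (with the default alertname removed). The function group_alarms and the sample alarm names are hypothetical.

```python
from collections import defaultdict


def group_alarms(alarms, group_by):
    """Group alarm names by the values of the given labels."""
    groups = defaultdict(list)
    for alarm in alarms:
        # Alarms sharing the same value for every grouping label share a key.
        key = tuple(alarm.get(label, "") for label in group_by)
        groups[key].append(alarm["alertname"])
    return dict(groups)


alarms = [
    {"alertname": "OrderServiceDown", "env": "test", "alert_type": "app_down"},
    {"alertname": "CartServiceDown", "env": "test", "alert_type": "app_down"},
    {"alertname": "OrderServiceDown", "env": "dev", "alert_type": "app_down"},
    {"alertname": "UserServiceDown", "env": "dev", "alert_type": "app_down"},
]

by_default = group_alarms(alarms, ["alertname"])
by_env_type = group_alarms(alarms, ["env", "alert_type"])
print(len(by_default), "messages when grouped by alertname")
print(len(by_env_type), "messages when grouped by env and alert_type")
```

          With these sample alarms, grouping by env and alert_type yields one group for the test environment and one for the dev environment, matching the behavior described above.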
