Cloud Product Event List
Updated at: 2025-11-03
Event monitoring in Baidu Cloud Monitor currently supports the following events:
Baidu Cloud Compute (BCC)
| Event name | Event type | Event level | Solutions and suggestions |
|---|---|---|---|
| Hard disk medium error | RepairFpdmaQueuedFail | Critical | The hard disk medium of your instance ${InstanceName} has encountered an error. Please authorize the repair promptly to address the issue. |
| Inaccessible PCIe address of GPU | RepairGPUPanic | Critical | The PCIe address of the GPU in your instance ${InstanceName} is currently inaccessible. Please authorize the repair promptly to resolve the problem. |
| GPU remapping failed | RepairRemappingFailed | Critical | The GPU remapping of your instance ${InstanceName} has failed. Please authorize the repair promptly to fix the issue. |
| GPU ECC error limit exceeded | RepairEccLimitExceeded | Critical | The GPU ECC error count on your instance ${InstanceName} has exceeded the limit. Please authorize the repair promptly. |
| CPU overheating | RepairThermalTrip | Critical | The CPU temperature of your instance ${InstanceName} is excessively high. Please authorize the repair promptly. |
| Memory bank virtual lockstep | RepairADCBankVLS | Critical | A memory bank on your instance ${InstanceName} has entered virtual lockstep (VLS). Please authorize the repair promptly. |
| Wrong PCI address of network interface card | RepairNicPciAddr_None | Critical | Your instance ${InstanceName} has a PCI address error with its network interface card. Please authorize the repair promptly. |
| Network interface card asset missing | RepairNicAssetMissing | Critical | The network interface card asset for your instance ${InstanceName} is missing. Please authorize the repair promptly. |
| Several hard disks damaged | RepairTooMany | Critical | Several hard drives on your instance ${InstanceName} are damaged. Please authorize the repair promptly. |
| Hard disk SMART reached fault criteria | RepairSMARTFail | Critical | The SMART data for the hard disk of your instance ${InstanceName} indicates a fault. Please authorize the repair promptly. |
| Hard disk head or circuit error | RepairHardwareError | Critical | Your instance ${InstanceName} has a hard disk head or circuitry issue. Please authorize the repair promptly. |
| Hard disk missing | RepairMissing | Critical | The hard disk for your instance ${InstanceName} is missing. Please authorize the repair promptly. |
| Hard disk drive letter unreadable | RepairNotReady | Critical | The drive letter for the hard disk on your instance ${InstanceName} is unreadable. Please authorize the repair promptly. |
| NVMe controller abnormal | RepairNvmeCritalErr | Critical | The NVMe controller on your instance ${InstanceName} is not functioning properly. Please authorize the repair promptly. |
| NVMe medium abnormal | RepairNvmeMediaErr | Critical | The NVMe medium on your instance ${InstanceName} is malfunctioning. Please authorize the repair promptly. |
| NVMe disk missing | RepairNvmeMissing | Critical | The NVMe disk for your instance ${InstanceName} is missing. Please authorize the repair promptly. |
| GPU overheating | RepairGpuHighTemperature | Critical | The GPU temperature on your instance ${InstanceName} is too high. Please authorize the repair promptly. |
| GPU drop out | RepairGpuMissing | Critical | The GPU on your instance ${InstanceName} is missing. Please authorize the repair promptly. |
| GPU memory fault | RepairGpuEccErr | Critical | The GPU memory on your instance ${InstanceName} is faulty. Please authorize the repair promptly. |
| Physical disk prediction error count exceeds limit | RepairRaidPdPreErr | Critical | The predicted error count for the physical disk of your instance ${InstanceName} has exceeded the limit. Please authorize the repair promptly. |
| Other error count of physical disk exceeds limit | RepairRaidPdOtherErr | Critical | The count of other physical disk errors on your instance ${InstanceName} has exceeded the limit. Please authorize the repair promptly. |
| Physical disk configuration error | RepairRaidPdUB | Critical | Your instance ${InstanceName} has a configuration issue with its physical disk. Please authorize the repair promptly. |
| Physical disk fault | RepairRaidPdFailed | Critical | Your instance ${InstanceName} has a physical disk failure. Please authorize the repair promptly. |
| Physical disk missing | RepairRaidPdMissing | Critical | The physical disk of your instance ${InstanceName} is missing. Please authorize the repair promptly. |
| Unrecognizable physical disk | RepairRaidPdOffline | Critical | The physical disk of your instance ${InstanceName} is unrecognizable. Please authorize the repair promptly. |
| CPU configuration error | RepairConfigurationError | Critical | Your instance ${InstanceName} has a CPU configuration error. Please authorize the repair promptly. |
| CPU internal error | RepairIERR | Critical | Your instance ${InstanceName} has an internal CPU error. Please authorize the repair promptly. |
| BIOS self-check error | RepairFRB1/BISTfailure | Critical | Your instance ${InstanceName} has encountered a BIOS self-check error. Please authorize the repair promptly. |
| The machine got stuck when starting up | RepairFRB2/HanginPOST failure | Critical | The machine of your instance ${InstanceName} is stuck during startup. Please authorize the repair promptly. |
| CPU root node error | RepairSMBIOSUncorrectableCPU-complexError | Critical | Your instance ${InstanceName} has a CPU root node error. Please authorize the repair promptly. |
| CPU MCE fault | RepairUncorrectablemachinecheckexception | Critical | Your instance ${InstanceName} has experienced a CPU MCE fault. Please authorize the repair promptly. |
| CPU detection error | RepairCorrectablemachinecheckerror | Critical | Your instance ${InstanceName} has a CPU detection error. Please authorize the repair promptly. |
| Memory fault | RepairDIMMUE | Critical | Your instance ${InstanceName} has encountered a memory fault. Please authorize the repair promptly. |
| Memory ECC error | RepairUncorrectableECC | Critical | Your instance ${InstanceName} has a memory ECC error. Please authorize the repair promptly. |
| Memory cleanup failure | RepairMemoryScrubFailed | Critical | The memory cleanup operation for your instance ${InstanceName} failed. Please authorize the repair promptly. |
| Memory parity error | RepairParity | Critical | Your instance ${InstanceName} has a memory parity error. Please authorize the repair promptly. |
| Memory isolation fault | RepairMemoryDeviceDisabled | Critical | Your instance ${InstanceName} has a memory isolation fault. Please authorize the repair promptly. |
| Too many recoverable memory ECC errors | RepairDIMMCELarge | Critical | Your instance ${InstanceName} has an excessive number of recoverable memory ECC errors. Please authorize the repair promptly. |
| Too many correctable memory errors | RepairCorrectableECClogginglimitreached | Critical | Your instance ${InstanceName} has too many correctable memory ECC errors. Please authorize the repair promptly. |
| Fan position error | RepairDeviceAbsent | Critical | Your instance ${InstanceName} has a fan position error. Please authorize the repair promptly. |
| Correctable bus error | RepairBusCorrectableerror | Critical | Your instance ${InstanceName} has experienced a correctable bus error. Please authorize the repair promptly. |
| Uncorrectable bus error | RepairBusUncorrectableerror | Critical | Your instance ${InstanceName} has encountered an uncorrectable bus error. Please authorize the repair promptly. |
| Bus critical error | RepairBusFatalError | Critical | Your instance ${InstanceName} has a critical bus error. Please authorize the repair promptly. |
| PCI parity error | RepairPCIPERR | Critical | Your instance ${InstanceName} has encountered a PCI parity error; please approve the repair promptly. |
| Mainboard bus error | RepairTransitiontoCriticalfromlesssevere | Critical | Your instance ${InstanceName} has encountered a mainboard bus error; please approve the repair promptly. |
| Mainboard unrecoverable | RepairTransitiontoNon-recoverablefromlesssevere | Critical | The mainboard of your instance ${InstanceName} has entered a non-recoverable state; please approve the repair promptly. |
| System hardware fault | RepairUndeterminedsystemhardwarefailure | Critical | Your instance ${InstanceName} has a system hardware issue; please approve the repair promptly. |
| Mainboard fault | RepairUnrecoverablesystem-boardfailure | Critical | Your instance ${InstanceName} has encountered a mainboard issue; please approve the repair promptly. |
| Video controller fault | RepairUnrecoverablevideocontrollerfailure | Critical | Your instance ${InstanceName} has encountered an unrecoverable video controller failure; please approve the repair promptly. |
| Smart NIC abnormal | RepairSNICError | Critical | The smart NIC in your instance ${InstanceName} is malfunctioning; please approve the repair promptly. |
| Failure to read GPU configuration space | RepairPciHeader_None | Critical | The GPU configuration space of your instance ${InstanceName} cannot be read; please approve the repair promptly. |
| Network interface card abnormal | RepairBCCInstanceNicLinkDownFAIL | Critical | The network interface card in your instance ${InstanceName} is malfunctioning; please approve the repair promptly. |
| Crash warning | RepairDIMMServerCrash | Critical | Your instance ${InstanceName} is at risk of a crash; please approve the repair promptly. |
| RAID card BBU fault | RepairRaidBBUFailed | Critical | Your instance ${InstanceName} has a RAID card BBU issue; please approve the repair promptly. |
| Downgraded logical disk due to the damage to the physical disk | RepairRaidVdDegraded | Critical | The logical disk of your instance ${InstanceName} has degraded due to physical disk damage; please approve the repair promptly. |
| Physical disk reached end of life | RepairRaidPdWearOut | Critical | The physical disk in your instance ${InstanceName} has reached the end of its lifespan; please approve the repair promptly. |
| Hard disk in DStatus | RepairDStatus | Critical | The hard disk in your instance ${InstanceName} is in DStatus; please approve the repair promptly. |
| Hard disk reached end of life | RepairSSDWearOut | Critical | The hard drive in your instance ${InstanceName} has reached the end of its lifespan; please approve the repair promptly. |
| Hard disk warning | RepairFARMPredict | Critical | Your instance ${InstanceName} has triggered a hard disk alert; please approve the repair promptly. |
| NVMe write/erase cycles worn out | RepairNvmeWearOut | Critical | The NVMe write/erase cycles in your instance ${InstanceName} are worn out; please approve the repair promptly. |
| Correctable errors reached threshold | RepairCorrectablememoryerrorlogginglimitreached | Critical | The correctable errors in your instance ${InstanceName} have exceeded the threshold; please approve the repair promptly. |
| Device error | RepairDeviceFault | Critical | Your instance ${InstanceName} has encountered a device error; please approve the repair promptly. |
| Power fault | RepairFailuredetected | Critical | Your instance ${InstanceName} has encountered a power issue; please approve the repair promptly. |
| RAID array error | RepairInCriticalArray | Critical | Your instance ${InstanceName} has encountered a RAID array error. Please authorize the repair promptly. |
| RAID array downgraded | RepairInFailedArray | Critical | The RAID array on your instance ${InstanceName} has degraded. Please authorize the repair promptly. |
| Installation error | RepairInstallError | Critical | An installation error has occurred on your instance ${InstanceName}. Please authorize the repair promptly. |
| Installation failed | RepairInstallationfailed | Critical | The installation process for your instance ${InstanceName} could not be completed. Please authorize the repair promptly. |
| PCI system error | RepairPCISERR | Critical | Your instance ${InstanceName} has a PCI system error. Please authorize the repair promptly. |
| Power fault detected | RepairPowerSupplyFailuredetected | Critical | A power fault has been detected on your instance ${InstanceName}. Please authorize the repair promptly. |
| Early alert | RepairPredictivefailure | Critical | An early alert has been triggered for your instance ${InstanceName}. Please authorize the repair promptly. |
| Uncorrectable memory ECC error | RepairUncorrectablememoryerror | Critical | Your instance ${InstanceName} has an uncorrectable ECC memory error. Please authorize the repair promptly. |
| Irrecoverable IDE device fault | RepairUnrecoverableIDEdevicefailure | Critical | Your instance ${InstanceName} has encountered an irrecoverable IDE device fault. Please authorize the repair promptly. |
| Hard disk sector damaged | BadSector | Warning | The hard disk sector on your instance ${InstanceName} is damaged. Please be aware of potential impacts on the applications running on this instance. |
| Data exceeded limit | TooLarge | Warning | The data on your instance ${InstanceName} has exceeded the allowable limit. Please be aware of potential impacts on the applications running on this instance. |
| Double bit error in memory | EccError | Warning | Your instance ${InstanceName} has a double-bit memory error. Please be aware of potential impacts on the applications running on this instance. |
| GPU bus error | GpusBusErr | Warning | Your instance ${InstanceName} has encountered a GPU bus error. Please be aware of potential impacts on the applications running on this instance. |
| GPU driver error | DriverError | Warning | Your instance ${InstanceName} has encountered a GPU driver error. Please be aware of potential impacts on the applications running on this instance. |
| Unrecognizable GPU | RmInitAdapterFailed | Warning | The GPU on your instance ${InstanceName} is not recognized. Please be aware of potential impacts on the applications running on this instance. |
| GPU memory error check not enabled | EccDisable | Warning | The GPU memory error-checking feature is disabled on your instance ${InstanceName}. Please be aware of potential impacts on the applications running on this instance. |
| GPU system management interruption timeout | SmiTimeout | Warning | The GPU system management interruption on your instance ${InstanceName} has timed out. Please be aware of potential impacts on the applications running on this instance. |
| GPU power consumption abnormal | PowerError | Warning | The GPU power consumption on your instance ${InstanceName} is abnormal. Please be aware of potential impacts on the applications running on this instance. |
| GPU interconnection service abnormal | FabricManagerNotRunning | Warning | The GPU interconnection service on your instance ${InstanceName} is operating abnormally. Please be aware of potential impacts on the applications running on this instance. |
| NVLink connection of the GPU interrupted | NvlinkInactive | Warning | The GPU NVLink connection on your instance ${InstanceName} has been interrupted. Please be aware of potential impacts on the applications running on this instance. |
| GPU microcontroller abnormal | Xid62 | Warning | The GPU microcontroller in your instance ${InstanceName} is behaving abnormally. Please monitor it and assess any potential impact on this instance's applications. |
| The GPU isolation mapping record failed | Xid64 | Warning | The GPU isolation mapping record for your instance ${InstanceName} has failed. Please monitor it and assess any potential impact on this instance's applications. |
| Uncontained GPU ECC error | Xid95 | Warning | An uncontained ECC error has been detected in the GPU of your instance ${InstanceName}. Please monitor it and assess any potential impact on this instance's applications. |
| GPU context switch timeout | Xid109 | Warning | A context switch timeout has occurred in the GPU of your instance ${InstanceName}. Please monitor it and assess any potential impact on this instance's applications. |
| GPU microcode error | StateExpection | Warning | A GPU microcode error has been detected in your instance ${InstanceName}. Please monitor it and assess any potential impact on this instance's applications. |
| GPU status error | StateError | Warning | Your instance ${InstanceName} has encountered a GPU status error. Please monitor it and assess any potential impact on this instance's applications. |
| GPU check timeout | CheckTimeout | Warning | The GPU check for your instance ${InstanceName} has timed out. Please monitor it and assess any potential impact on this instance's applications. |
| RAID virtual disk abnormal | RaidVdNotReady | Warning | The RAID virtual disk in your instance ${InstanceName} is experiencing an anomaly. Please monitor it and assess any potential impact on this instance's applications. |
| RAID virtual disk timeout | RaidVdWT | Warning | Your instance ${InstanceName} has encountered a RAID virtual disk timeout event. Please monitor it and assess any potential impact on this instance's applications. |
| RAID virtual disk inaccessible | RaidVdNra | Warning | The RAID virtual disk in your instance ${InstanceName} is currently inaccessible. Please monitor it and assess any potential impact on this instance's applications. |
| RAID virtual disk inaccessible | RaidPdMediaErr | Warning | The RAID virtual disk in your instance ${InstanceName} is currently inaccessible. Please monitor it and assess any potential impact on this instance's applications. |
| RAID physical disk medium fault | RaidPdUG | Warning | A fault has been detected in the RAID physical disk medium of your instance ${InstanceName}. Please monitor it and assess any potential impact on this instance's applications. |
| BBU cache missing in the RAID virtual disk | RaidVdNoBBUCacheErr | Warning | The BBU cache is missing in the RAID virtual disk of your instance ${InstanceName}. Please monitor it and assess any potential impact on this instance's applications. |
| OCSSD permission error | Permission_wrong | Warning | Your instance ${InstanceName} has encountered an OCSSD permission error. Please monitor it and assess any potential impact on this instance's applications. |
| CPU self-check error | FRB2/HanginPOSTfailure | Warning | A CPU self-check error has been detected in your instance ${InstanceName}. Please monitor it and assess any potential impact on this instance's applications. |
| CPU frequency too low | CPUFreqLow | Warning | The CPU frequency of your instance ${InstanceName} is unusually low. Please monitor it and assess any potential impact on this instance's applications. |
| Network interface card connection unstable | NicLinkFlutter | Warning | The network interface card connection in your instance ${InstanceName} is unstable. Please monitor it and assess any potential impact on the applications running on this instance. |
| Network interface card overheating | NicHighTemp | Warning | The network interface card in your instance ${InstanceName} is overheating. Please monitor it and assess any potential impact on this instance's applications. |
| Network interface card adapter disconnected | NicAdRtDis(H800) | Warning | The network interface card adapter in your instance ${InstanceName} has disconnected. Please monitor it and assess any potential impact on this instance's applications. |
| IP address of the network interface card duplicated | NicNonSingleIpErr(SCI) | Warning | The IP address of the network interface card in your instance ${InstanceName} is duplicated. Please monitor it and assess any potential impact on this instance's applications. |
| Machine overheating | UpperCriticalgoinghigh | Warning | The machine hosting your instance ${InstanceName} is overheating. Please monitor it and assess any potential impact on this instance's applications. |
| Fan speed extremely low | LowerCriticalgoinglow | Warning | The fan speed on your instance ${InstanceName} is exceptionally low. Please monitor this closely as it may affect the application running on this instance. |
| Fan speed relatively low | LowerNon-criticalgoinglow | Warning | The fan speed on your instance ${InstanceName} is somewhat low. Please be cautious as it could impact the application running on this instance. |
| Mainboard fault unrecoverable | TransitiontoNon-recoverablefromlesssever | Warning | The mainboard issue on your instance ${InstanceName} is irreparable. Please take note of this as it may affect the application running on this instance. |
| Mainboard system firmware error | SystemFirmwareError | Warning | Your instance ${InstanceName} is experiencing a system firmware error on the mainboard. Please monitor this as it could affect the application running on this instance. |
| Mainboard diagnostics interrupted | NMI/DiagInterrupt | Warning | The diagnostic process for the mainboard on your instance ${InstanceName} has been interrupted. Please be aware of this as it might impact the application running on this instance. |
| Mainboard status downgraded | TransitiontoDegraded | Warning | The status of the mainboard on your instance ${InstanceName} has deteriorated. Please keep this in mind as it could affect the application running on this instance. |
| Hard disk backplane drive error | DriveFault | Warning | Your instance ${InstanceName} is encountering a hard disk backplane drive error. Please pay attention to this as it may affect the application operating on this instance. |
| PCIe link slowdown | LaneDrop | Warning | The PCIe link on your instance ${InstanceName} has slowed down. Please be aware of this as it might impact the application running on this instance. |
| PCIe link bandwidth decreased | BWDrop | Warning | The PCIe link bandwidth on your instance ${InstanceName} has reduced. Please monitor this as it may affect the application running on this instance. |
| Hard disk medium error | MediaError | Warning | Your instance ${InstanceName} is experiencing a hard disk medium error. Please pay attention to this as it could affect the application running on this instance. |
| Hard disk IO error | IOError | Warning | Your instance ${InstanceName} is encountering a hard disk IO error. Please be cautious as this might impact the application running on this instance. |
| NVMe cannot read SMART information | NvmeSmartFail_None | Warning | The NVMe drive on your instance ${InstanceName} is unable to access SMART information. Please be aware of this as it may affect the application running on this instance. |
| GPU xid13 error | Xid13 | Warning | Your instance ${InstanceName} is experiencing a GPU xid13 error. Please observe this issue carefully as it might impact the application running on this instance. |
| GPU xid31 error | Xid31 | Warning | Your instance ${InstanceName} is encountering a GPU xid31 error. Please take note of this as it could affect the application running on this instance. |
| GPU xid32 error | Xid32 | Warning | Your instance ${InstanceName} has a GPU xid32 error. Please monitor this closely as it might impact the application running on this instance. |
| GPU xid43 error | Xid43 | Warning | Your instance ${InstanceName} is experiencing a GPU xid43 error. Please pay attention to this as it might affect the application running on this instance. |
| GPU xid45 error | Xid45 | Warning | Your instance ${InstanceName} is encountering a GPU xid45 error. Please be aware of this as it may influence the application running on this instance. |
| GPU xid48 error | Xid48 | Warning | Your instance ${InstanceName} has a GPU xid48 error. Please monitor this issue as it could affect the application running on this instance. |
| GPU xid63 error | Xid63 | Warning | Your instance ${InstanceName} has a GPU xid63 error. Please observe this carefully, as it might impact the application running on this instance. |
| GPU ECC error | EccErr | Warning | Your instance ${InstanceName} is experiencing a GPU ECC error. Please keep an eye on this issue as it could affect the application running on this instance. |
| GPU xid74 error | Xid74 | Warning | Your instance ${InstanceName} has encountered a GPU xid74 error. Please take note and consider the potential impact on the application running on this instance. |
| GPU xid79 error | Xid79 | Warning | Your instance ${InstanceName} has encountered a GPU xid79 error. Please take note and consider the potential impact on the application running on this instance. |
| CPU cache read/write error | CPUCacheErr | Warning | Your instance ${InstanceName} has a CPU cache read/write error. Please take note and consider the potential impact on the application running on this instance. |
| CPU frequency decreased | Throttled | Warning | The CPU frequency of your instance ${InstanceName} has dropped. Please take note and consider the potential impact on the application running on this instance. |
| Network interface card connection abnormal | NicLinkDown | Warning | The network interface card of your instance ${InstanceName} is experiencing issues. Please take note and consider the potential impact on the application running on this instance. |
| Network interface card slowdown | NicSpeedLow | Warning | The speed of the network interface card on your instance ${InstanceName} has slowed down. Please take note and consider the potential impact on the application running on this instance. |
| Network interface card bus error: bandwidth decreased | NicBusErr_BW | Warning | Your instance ${InstanceName} has encountered a network interface card bus error, leading to decreased bandwidth. Please take note and consider the potential impact on the application running on this instance. |
| Network interface card bus error: slowdown | NicBusErr_SP | Warning | Your instance ${InstanceName} has encountered a network interface card bus error, causing a slowdown. Please take note and consider the potential impact on the application running on this instance. |
| Network port CRC error | NICCRC | Warning | Your instance ${InstanceName} has a network port CRC error. Please take note and consider the potential impact on the application running on this instance. |
| Low output power of optical module | tx_power_is_low_alarm | Warning | The output power of the optical module in your instance ${InstanceName} is low. Please take note and consider the potential impact on the application running on this instance. |
| Low current of optical module | bias_cur_is_low_alarm | Warning | The current of the optical module in your instance ${InstanceName} is low. Please take note and consider the potential impact on the application running on this instance. |
| Low input power of optical module | rx_power_is_low_alarm | Warning | The input power of the optical module in your instance ${InstanceName} is low. Please take note and consider the potential impact on the application running on this instance. |
| Memory error | DIMMCE | Warning | Your instance ${InstanceName} has encountered a memory error. Please take note and consider the potential impact on the application running on this instance. |
| Too many recoverable memory errors | CorrectableECC | Warning | Your instance ${InstanceName} has too many recoverable memory errors. Please take note and consider the potential impact on the application running on this instance. |
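
The descriptions in the table above use an `${InstanceName}` placeholder, and every Critical row shown there has an event type beginning with `Repair`. The following is a minimal Python sketch of how a consumer of these events might fill in the placeholder and separate repair-authorization events from warnings. It is an illustration only: how Baidu Cloud Monitor itself substitutes the placeholder is not described here, the helper names `render_event` and `needs_repair_authorization` are hypothetical, and the `Repair` prefix check is an observation about the rows listed above rather than a documented rule.

```python
from string import Template

# Two rows copied from the BCC table above: event type -> (level, description).
EVENTS = {
    "RepairThermalTrip": (
        "Critical",
        "The CPU temperature of your instance ${InstanceName} is excessively high. "
        "Please authorize the repair promptly.",
    ),
    "BadSector": (
        "Warning",
        "The hard disk sector on your instance ${InstanceName} is damaged. "
        "Please be aware of potential impacts on the applications running on this instance.",
    ),
}


def render_event(event_type: str, instance_name: str) -> str:
    """Fill the ${InstanceName} placeholder for a known event type (hypothetical helper)."""
    level, description = EVENTS[event_type]
    # string.Template uses the same ${name} syntax as the table's placeholder.
    message = Template(description).substitute(InstanceName=instance_name)
    return f"[{level}] {event_type}: {message}"


def needs_repair_authorization(event_type: str) -> bool:
    """Assumption: Critical rows in the table above all carry a 'Repair' prefix."""
    return event_type.startswith("Repair")


print(render_event("RepairThermalTrip", "web-server-01"))
```

A lookup keyed by event type keeps the level and description together, so the same dictionary can be extended with further rows from the tables without changing the helpers.
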
Elastic Baremetal Compute (BBC)
| Event name | Event type | Event level | Solutions and suggestions |
|---|---|---|---|
| Hard disk medium error | RepairFpdmaQueuedFail | Critical | The hard disk medium of your instance ${InstanceName} has encountered an error. Please authorize the repair promptly to address the issue. |
| Inaccessible PCIe address of GPU | RepairGPUPanic | Critical | The PCIe address of the GPU in your instance ${InstanceName} is currently inaccessible. Please authorize the repair promptly to resolve the problem. |
| GPU remapping failed | RepairRemappingFailed | Critical | The GPU remapping of your instance ${InstanceName} has failed. Please authorize the repair promptly to fix the issue. |
| GPU ECC error limit exceeded | RepairEccLimitExceeded | Critical | The GPU ECC error count on your instance ${InstanceName} has exceeded the limit. Please authorize the repair promptly. |
| CPU overheating | RepairThermalTrip | Critical | The CPU temperature of your instance ${InstanceName} is excessively high. Please authorize the repair promptly. |
| Memory bank virtual lockstep | RepairADCBankVLS | Critical | A memory bank on your instance ${InstanceName} has entered virtual lockstep (VLS). Please authorize the repair promptly. |
| Wrong PCI address of network interface card | RepairNicPciAddr_None | Critical | Your instance ${InstanceName} has a PCI address error with its network interface card. Please authorize the repair promptly. |
| Network interface card asset missing | RepairNicAssetMissing | Critical | The network interface card asset for your instance ${InstanceName} is missing. Please authorize the repair promptly. |
| Several hard disks damaged | RepairTooMany | Critical | Several hard drives on your instance ${InstanceName} are damaged. Please authorize the repair promptly. |
| Hard disk SMART reached fault criteria | RepairSMARTFail | Critical | The SMART data for the hard disk of your instance ${InstanceName} indicates a fault. Please authorize the repair promptly. |
| Hard disk head or circuit error | RepairHardwareError | Critical | Your instance ${InstanceName} has a hard disk head or circuitry issue. Please authorize the repair promptly. |
| Hard disk missing | RepairMissing | Critical | The hard disk for your instance ${InstanceName} is missing. Please authorize the repair promptly. |
| Hard disk drive letter unreadable | RepairNotReady | Critical | The drive letter for the hard disk on your instance ${InstanceName} is unreadable. Please authorize the repair promptly. |
| NVMe controller abnormal | RepairNvmeCritalErr | Critical | The NVMe controller on your instance ${InstanceName} is not functioning properly. Please authorize the repair promptly. |
| NVMe medium abnormal | RepairNvmeMediaErr | Critical | The NVMe medium on your instance ${InstanceName} is malfunctioning. Please authorize the repair promptly. |
| NVMe disk missing | RepairNvmeMissing | Critical | The NVMe disk for your instance ${InstanceName} is missing. Please authorize the repair promptly. |
| GPU overheating | RepairGpuHighTemperature | Critical | The GPU temperature on your instance ${InstanceName} is too high. Please authorize the repair promptly. |
| GPU drop out | RepairGpuMissing | Critical | The GPU on your instance ${InstanceName} is missing. Please authorize the repair promptly. |
| GPU memory fault | RepairGpuEccErr | Critical | The GPU memory on your instance ${InstanceName} is faulty. Please authorize the repair promptly. |
| Physical disk prediction error count exceeds limit | RepairRaidPdPreErr | Critical | The predicted error count for the physical disk of your instance ${InstanceName} has exceeded the limit. Please authorize the repair promptly. |
| Other error count of physical disk exceeds limit | RepairRaidPdOtherErr | Critical | The count of other physical disk errors on your instance ${InstanceName} has exceeded the limit. Please authorize the repair promptly. |
| Physical disk configuration error | RepairRaidPdUB | Critical | Your instance ${InstanceName} has a configuration issue with its physical disk. Please authorize the repair promptly. |
| Physical disk fault | RepairRaidPdFailed | Critical | Your instance ${InstanceName} has a physical disk failure. Please authorize the repair promptly. |
| Physical disk missing | RepairRaidPdMissing | Critical | The physical disk of your instance ${InstanceName} is missing. Please authorize the repair promptly. |
| Unrecognizable physical disk | RepairRaidPdOffline | Critical | The physical disk of your instance ${InstanceName} is unrecognizable. Please authorize the repair promptly. |
| CPU configuration error | RepairConfigurationError | Critical | Your instance ${InstanceName} has a CPU configuration error. Please authorize the repair promptly. |
| CPU internal error | RepairIERR | Critical | Your instance ${InstanceName} has an internal CPU error. Please authorize the repair promptly. |
| BIOS self-check error | RepairFRB1/BISTfailure | Critical | Your instance ${InstanceName} has encountered a BIOS self-check error. Please authorize the repair promptly. |
| The machine got stuck when starting up | RepairFRB2/HanginPOST failure | Critical | The machine of your instance ${InstanceName} is stuck during startup. Please authorize the repair promptly. |
| CPU root node error | RepairSMBIOSUncorrectableCPU-complexError | Critical | Your instance ${InstanceName} has a CPU root node error. Please authorize the repair promptly. |
| CPU MCE fault | RepairUncorrectablemachinecheckexception | Critical | Your instance ${InstanceName} has experienced a CPU MCE fault. Please authorize the repair promptly. |
| CPU detection error | RepairCorrectablemachinecheckerror | Critical | Your instance ${InstanceName} has a CPU detection error. Please authorize the repair promptly. |
| Memory fault | RepairDIMMUE | Critical | Your instance ${InstanceName} has encountered a memory fault. Please authorize the repair promptly. |
| Memory ECC error | RepairUncorrectableECC | Critical | Your instance ${InstanceName} has a memory ECC error. Please authorize the repair promptly. |
| Memory cleanup failure | RepairMemoryScrubFailed | Critical | The memory cleanup operation for your instance ${InstanceName} failed. Please authorize the repair promptly. |
| Memory parity error | RepairParity | Critical | Your instance ${InstanceName} has a memory parity error. Please authorize the repair promptly. |
| Memory isolation fault | RepairMemoryDeviceDisabled | Critical | Your instance ${InstanceName} has a memory isolation fault. Please authorize the repair promptly. |
| Too many recoverable memory ECC errors | RepairDIMMCELarge | Critical | Your instance ${InstanceName} has an excessive number of recoverable memory ECC errors. Please authorize the repair promptly. |
| Too many correctable memory errors | RepairCorrectableECClogginglimitreached | Critical | Your instance ${InstanceName} has too many correctable memory ECC errors. Please authorize the repair promptly. |
| Fan position error | RepairDeviceAbsent | Critical | Your instance ${InstanceName} has a fan position error. Please authorize the repair promptly. |
| Correctable bus error | RepairBusCorrectableerror | Critical | Your instance ${InstanceName} has experienced a correctable bus error. Please authorize the repair promptly. |
| Uncorrectable bus error | RepairBusUncorrectableerror | Critical | Your instance ${InstanceName} has encountered an uncorrectable bus error. Please authorize the repair promptly. |
| Bus critical error | RepairBusFatalError | Critical | Your instance ${InstanceName} has a critical bus error. Please authorize the repair promptly. |
| PCI parity error | RepairPCIPERR | Critical | Your instance ${InstanceName} has encountered a PCI parity error; please approve the repair promptly. |
| Mainboard bus error | RepairTransitiontoCriticalfromlesssevere | Critical | Your instance ${InstanceName} has encountered a mainboard bus error; please approve the repair promptly. |
| Mainboard unrecoverable | RepairTransitiontoNon-recoverablefromlesssevere | Critical | The mainboard of your instance ${InstanceName} has an unrecoverable fault; please approve the repair promptly. |
| System hardware fault | RepairUndeterminedsystemhardwarefailure | Critical | Your instance ${InstanceName} has a system hardware issue; please approve the repair promptly. |
| Mainboard fault | RepairUnrecoverablesystem-boardfailure | Critical | Your instance ${InstanceName} has encountered a mainboard issue; please approve the repair promptly. |
| Memory fault | RepairUnrecoverablevideocontrollerfailure | Critical | Your instance ${InstanceName} has encountered a memory issue; please approve the repair promptly. |
| Smart card abnormal | RepairSNICError | Critical | The smart card in your instance ${InstanceName} is malfunctioning; please approve the repair promptly. |
| Failure to read GPU configuration space | RepairPciHeader_None | Critical | The GPU configuration space of your instance ${InstanceName} cannot be read; please approve the repair promptly. |
| Network interface card abnormal | RepairBCCInstanceNicLinkDownFAIL | Critical | The network interface card in your instance ${InstanceName} is malfunctioning; please approve the repair promptly. |
| Crash warning | RepairDIMMServerCrash | Critical | Your instance ${InstanceName} is at risk of a crash; please approve the repair promptly. |
| RAID card BBU fault | RepairRaidBBUFailed | Critical | Your instance ${InstanceName} has a RAID card BBU issue; please approve the repair promptly. |
| Logical disk degraded due to physical disk damage | RepairRaidVdDegraded | Critical | The logical disk of your instance ${InstanceName} has degraded due to physical disk damage; please approve the repair promptly. |
| Physical disk reached end of life | RepairRaidPdWearOut | Critical | The physical disk in your instance ${InstanceName} has reached the end of its lifespan; please approve the repair promptly. |
| Hard disk in DStatus | RepairDStatus | Critical | The hard disk in your instance ${InstanceName} is in DStatus; please approve the repair promptly. |
| Hard disk reached end of life | RepairSSDWearOut | Critical | The hard drive in your instance ${InstanceName} has reached the end of its lifespan; please approve the repair promptly. |
| Hard disk warning | RepairFARMPredict | Critical | Your instance ${InstanceName} has triggered a hard disk alert; please approve the repair promptly. |
| NVMe write/erase cycles worn out | RepairNvmeWearOut | Critical | The NVMe drive in your instance ${InstanceName} has exhausted its write/erase cycles; please approve the repair promptly. |
| Correctable errors reached threshold | RepairCorrectablememoryerrorlogginglimitreached | Critical | The correctable errors in your instance ${InstanceName} have exceeded the threshold; please approve the repair promptly. |
| Device error | RepairDeviceFault | Critical | Your instance ${InstanceName} has encountered a device error; please approve the repair promptly. |
| Power fault | RepairFailuredetected | Critical | Your instance ${InstanceName} has encountered a power issue; please approve the repair promptly. |
| RAID array error | RepairInCriticalArray | Critical | Your instance ${InstanceName} has encountered a RAID array error. Please authorize the repair promptly. |
| RAID array degraded | RepairInFailedArray | Critical | The RAID array on your instance ${InstanceName} has degraded. Please authorize the repair promptly. |
| Installation error | RepairInstallError | Critical | An installation error has occurred on your instance ${InstanceName}. Please authorize the repair promptly. |
| Installation failed | RepairInstallationfailed | Critical | The installation process for your instance ${InstanceName} could not be completed. Please authorize the repair promptly. |
| PCI system error | RepairPCISERR | Critical | Your instance ${InstanceName} has a PCI system error. Please authorize the repair promptly. |
| Power fault detected | RepairPowerSupplyFailuredetected | Critical | A power fault has been detected on your instance ${InstanceName}. Please authorize the repair promptly. |
| Early alert | RepairPredictivefailure | Critical | An early alert has been triggered for your instance ${InstanceName}. Please authorize the repair promptly. |
| Uncorrectable memory ECC error | RepairUncorrectablememoryerror | Critical | Your instance ${InstanceName} has an uncorrectable ECC memory error. Please authorize the repair promptly. |
| Irrecoverable IDE device fault | RepairUnrecoverableIDEdevicefailure | Critical | Your instance ${InstanceName} has encountered an irrecoverable IDE device fault. Please authorize the repair promptly. |
| Hard disk sector damaged | BadSector | Warning | The hard disk sector on your instance ${InstanceName} is damaged. Please be aware of potential impacts on the applications running on this instance. |
| Data exceeded limit | TooLarge | Warning | The data on your instance ${InstanceName} has exceeded the allowable limit. Please be aware of potential impacts on the applications running on this instance. |
| Double-bit error in memory | EccError | Warning | Your instance ${InstanceName} has a double-bit memory error. Please be aware of potential impacts on the applications running on this instance. |
| GPU bus error | GpusBusErr | Warning | Your instance ${InstanceName} has encountered a GPU bus error. Please be aware of potential impacts on the applications running on this instance. |
| GPU driver error | DriverError | Warning | Your instance ${InstanceName} has encountered a GPU driver error. Please be aware of potential impacts on the applications running on this instance. |
| Unrecognizable GPU | RmInitAdapterFailed | Warning | The GPU on your instance ${InstanceName} is not recognized. Please be aware of potential impacts on the applications running on this instance. |
| GPU memory error check not enabled | EccDisable | Warning | The GPU memory error-checking feature is disabled on your instance ${InstanceName}. Please be aware of potential impacts on the applications running on this instance. |
| GPU system management interruption timeout | SmiTimeout | Warning | The GPU system management interruption on your instance ${InstanceName} has timed out. Please be aware of potential impacts on the applications running on this instance. |
| GPU power consumption abnormal | PowerError | Warning | The GPU power consumption on your instance ${InstanceName} is abnormal. Please be aware of potential impacts on the applications running on this instance. |
| GPU interconnection service abnormal | FabricManagerNotRunning | Warning | The GPU interconnection service on your instance ${InstanceName} is operating abnormally. Please be aware of potential impacts on the applications running on this instance. |
| NVLink connection of the GPU interrupted | NvlinkInactive | Warning | The GPU NVLink connection on your instance ${InstanceName} has been interrupted. Please be aware of potential impacts on the applications running on this instance. |
| GPU microcontroller abnormal | Xid62 | Warning | The GPU microcontroller in your instance ${InstanceName} is behaving abnormally. Please monitor it and assess any potential impact on this instance's applications. |
| The GPU isolation mapping record failed | Xid64 | Warning | The GPU isolation mapping record for your instance ${InstanceName} has failed. Please monitor it and assess any potential impact on this instance's applications. |
| Uncontained GPU ECC error | Xid95 | Warning | An uncontained ECC error has occurred in the GPU of your instance ${InstanceName}. Please monitor it and assess any potential impact on this instance's applications. |
| GPU context switch timeout | Xid109 | Warning | A context switch timeout has occurred in the GPU of your instance ${InstanceName}. Please monitor it and assess any potential impact on this instance's applications. |
| GPU microcode error | StateExpection | Warning | A GPU microcode error has been detected in your instance ${InstanceName}. Please monitor it and assess any potential impact on this instance's applications. |
| GPU status error | StateError | Warning | Your instance ${InstanceName} has encountered a GPU status error. Please monitor it and assess any potential impact on this instance's applications. |
| GPU check timeout | CheckTimeout | Warning | The GPU check for your instance ${InstanceName} has timed out. Please monitor it and assess any potential impact on this instance's applications. |
| RAID virtual disk abnormal | RaidVdNotReady | Warning | The RAID virtual disk in your instance ${InstanceName} is experiencing an anomaly. Please monitor it and assess any potential impact on this instance's applications. |
| RAID virtual disk timeout | RaidVdWT | Warning | Your instance ${InstanceName} has encountered a RAID virtual disk timeout event. Please monitor it and assess any potential impact on this instance's applications. |
| RAID virtual disk inaccessible | RaidVdNra | Warning | The RAID virtual disk in your instance ${InstanceName} is currently inaccessible. Please monitor it and assess any potential impact on this instance's applications. |
| RAID virtual disk inaccessible | RaidPdMediaErr | Warning | The RAID virtual disk in your instance ${InstanceName} is currently inaccessible. Please monitor it and assess any potential impact on this instance's applications. |
| RAID physical disk medium fault | RaidPdUG | Warning | A fault has been detected in the RAID physical disk medium of your instance ${InstanceName}. Please monitor it and assess any potential impact on this instance's applications. |
| BBU cache is missing in the RAID virtual disk | RaidVdNoBBUCacheErr | Warning | The BBU cache is missing in the RAID virtual disk of your instance ${InstanceName}. Please monitor it and assess any potential impact on this instance's applications. |
| OCSSD permission error | Permission_wrong | Warning | Your instance ${InstanceName} has encountered an OCSSD permission error. Please monitor it and assess any potential impact on this instance's applications. |
| CPU self-check error | FRB2/HanginPOSTfailure | Warning | A CPU self-check error has been detected in your instance ${InstanceName}. Please monitor it and assess any potential impact on this instance's applications. |
| CPU frequency too low | CPUFreqLow | Warning | The CPU frequency of your instance ${InstanceName} is unusually low. Please monitor it and assess any potential impact on this instance's applications. |
| Network interface card connection unstable | NicLinkFlutter | Warning | The network interface card connection in your instance ${InstanceName} is unstable. Please monitor it and assess any potential impact on the applications running on this instance. |
| Network interface card overheating | NicHighTemp | Warning | The network interface card in your instance ${InstanceName} is overheating. Please monitor it and assess any potential impact on this instance's applications. |
| Network interface card adapter disconnected | NicAdRtDis(H800) | Warning | The network interface card adapter in your instance ${InstanceName} has disconnected. Please monitor it and assess any potential impact on this instance's applications. |
| IP address of the network interface card duplicated | NicNonSingleIpErr(SCI) | Warning | The IP address of the network interface card in your instance ${InstanceName} is duplicated. Please monitor it and assess any potential impact on this instance's applications. |
| Machine overheating | UpperCriticalgoinghigh | Warning | The machine hosting your instance ${InstanceName} is overheating. Please monitor it and assess any potential impact on this instance's applications. |
| Fan speed extremely low | LowerCriticalgoinglow | Warning | The fan speed on your instance ${InstanceName} is exceptionally low. Please monitor this closely as it may affect the application running on this instance. |
| Fan speed relatively low | LowerNon-criticalgoinglow | Warning | The fan speed on your instance ${InstanceName} is somewhat low. Please be cautious as it could impact the application running on this instance. |
| Mainboard fault unrecoverable | TransitiontoNon-recoverablefromlesssever | Warning | The mainboard of your instance ${InstanceName} has an unrecoverable fault. Please take note of this as it may affect the application running on this instance. |
| Mainboard system firmware error | SystemFirmwareError | Warning | Your instance ${InstanceName} is experiencing a system firmware error on the mainboard. Please monitor this as it could affect the application running on this instance. |
| Mainboard diagnostics interrupted | NMI/DiagInterrupt | Warning | The diagnostic process for the mainboard on your instance ${InstanceName} has been interrupted. Please be aware of this as it might impact the application running on this instance. |
| Mainboard status degraded | TransitiontoDegraded | Warning | The status of the mainboard on your instance ${InstanceName} has degraded. Please keep this in mind as it could affect the application running on this instance. |
| Hard disk backplane drive error | DriveFault | Warning | Your instance ${InstanceName} is encountering a hard disk backplane drive error. Please pay attention to this as it may affect the application operating on this instance. |
| PCIe link slowdown | LaneDrop | Warning | The PCIe link on your instance ${InstanceName} has slowed down. Please be aware of this as it might impact the application running on this instance. |
| PCIe link bandwidth decreased | BWDrop | Warning | The PCIe link bandwidth on your instance ${InstanceName} has reduced. Please monitor this as it may affect the application running on this instance. |
| Hard disk medium error | MediaError | Warning | Your instance ${InstanceName} is experiencing a hard disk medium error. Please pay attention to this as it could affect the application running on this instance. |
| Hard disk IO error | IOError | Warning | Your instance ${InstanceName} is encountering a hard disk IO error. Please be cautious as this might impact the application running on this instance. |
| SMART information cannot be read from NVMe drive | NvmeSmartFail_None | Warning | SMART information cannot be read from the NVMe drive on your instance ${InstanceName}. Please be aware of this as it may affect the application running on this instance. |
| GPU xid13 error | Xid13 | Warning | Your instance ${InstanceName} is experiencing a GPU xid13 error. Please observe this issue carefully as it might impact the application running on this instance. |
| GPU xid31 error | Xid31 | Warning | Your instance ${InstanceName} is encountering a GPU xid31 error. Please take note of this as it could affect the application running on this instance. |
| GPU xid32 error | Xid32 | Warning | Your instance ${InstanceName} has a GPU xid32 error. Please monitor this closely as it might impact the application running on this instance. |
| GPU xid43 error | Xid43 | Warning | Your instance ${InstanceName} is experiencing a GPU xid43 error. Please pay attention to this as it might affect the application running on this instance. |
| GPU xid45 error | Xid45 | Warning | Your instance ${InstanceName} is encountering a GPU xid45 error. Please be aware of this as it may influence the application running on this instance. |
| GPU xid48 error | Xid48 | Warning | Your instance ${InstanceName} has a GPU xid48 error. Please monitor this issue as it could affect the application running on this instance. |
| GPU xid63 error | Xid63 | Warning | Your instance ${InstanceName} has a GPU xid63 error. Please observe this carefully, as it might impact the application running on this instance. |
| GPU ECC error | EccErr | Warning | Your instance ${InstanceName} is experiencing a GPU ECC error. Please keep an eye on this issue as it could affect the application running on this instance. |
| GPU xid74 error | Xid74 | Warning | Your instance ${InstanceName} has encountered a GPU xid74 error. Please take note and consider the potential impact on the application running on this instance. |
| GPU xid79 error | Xid79 | Warning | Your instance ${InstanceName} has encountered a GPU xid79 error. Please take note and consider the potential impact on the application running on this instance. |
| CPU cache read/write error | CPUCacheErr | Warning | Your instance ${InstanceName} has a CPU cache read/write error. Please take note and consider the potential impact on the application running on this instance. |
| CPU frequency decreased | Throttled | Warning | The CPU frequency of your instance ${InstanceName} has dropped. Please take note and consider the potential impact on the application running on this instance. |
| Network interface card connection abnormal | NicLinkDown | Warning | The network interface card of your instance ${InstanceName} is experiencing issues. Please take note and consider the potential impact on the application running on this instance. |
| Network interface card slowdown | NicSpeedLow | Warning | The speed of the network interface card on your instance ${InstanceName} has slowed down. Please take note and consider the potential impact on the application running on this instance. |
| Network interface card bus error: bandwidth decreased | NicBusErr_BW | Warning | Your instance ${InstanceName} has encountered a network interface card bus error, leading to decreased bandwidth. Please take note and consider the potential impact on the application running on this instance. |
| Network interface card bus error: slowdown | NicBusErr_SP | Warning | Your instance ${InstanceName} has encountered a network interface card bus error, causing a slowdown. Please take note and consider the potential impact on the application running on this instance. |
| Network port CRC error | NICCRC | Warning | Your instance ${InstanceName} has a network port CRC error. Please take note and consider the potential impact on the application running on this instance. |
| Low output power of optical module | tx_power_is_low_alarm | Warning | The output power of the optical module in your instance ${InstanceName} is low. Please take note and consider the potential impact on the application running on this instance. |
| Low current of optical module | bias_cur_is_low_alarm | Warning | The current of the optical module in your instance ${InstanceName} is low. Please take note and consider the potential impact on the application running on this instance. |
| Low input power of optical module | rx_power_is_low_alarm | Warning | The input power of the optical module in your instance ${InstanceName} is low. Please take note and consider the potential impact on the application running on this instance. |
| Memory error | DIMMCE | Warning | Your instance ${InstanceName} has encountered a memory error. Please take note and consider the potential impact on the application running on this instance. |
| Too many recoverable memory errors | CorrectableECC | Warning | Your instance ${InstanceName} has too many recoverable memory errors. Please take note and consider the potential impact on the application running on this instance. |
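
In the rows listed above, Critical events carry a `Repair`-prefixed event type and ask you to authorize a repair, while Warning events ask you to assess the impact on your applications. The following is a minimal Python sketch of routing events along those lines. The dictionary shape of an event and the names `suggested_action`, `authorize_repair`, `assess_impact`, and `review` are hypothetical and are not part of any documented Baidu Cloud Monitor API. Levels are compared case-insensitively because the tables on this page use both `Critical` and `CRITICAL`.

```python
# Hypothetical helper: the event shape and action names are assumptions
# made for illustration, not a documented payload format.
REPAIR_PREFIX = "Repair"


def suggested_action(event_type: str, level: str) -> str:
    """Return a coarse follow-up action for a hardware event."""
    normalized = level.strip().lower()
    # Critical events (Repair-prefixed in the rows above) call for a
    # repair authorization.
    if normalized == "critical" or event_type.startswith(REPAIR_PREFIX):
        return "authorize_repair"
    # Warning events ask the user to assess application impact.
    if normalized == "warning":
        return "assess_impact"
    # Anything else is left for manual review.
    return "review"


events = [
    {"type": "RepairNvmeMediaErr", "level": "Critical"},
    {"type": "Xid79", "level": "Warning"},
    {"type": "NicLinkDown", "level": "WARNING"},
]
actions = {e["type"]: suggested_action(e["type"], e["level"]) for e in events}
print(actions)
```

Checking the level first and treating the prefix as a fallback keeps the sketch usable even if an event arrives without a recognizable level string.
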
Baidu Edge Computing (BEC)
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| Several hard disks damaged | RepairTooMany | CRITICAL | Several hard drives on your instance ${InstanceName} are damaged. Please authorize the repair promptly. |
| Hard disk head or circuit error | RepairHardwareError | CRITICAL | Your instance ${InstanceName} has a hard disk head or circuitry issue. Please authorize the repair promptly. |
| Hard disk missing | RepairMissing | CRITICAL | The hard disk for your instance ${InstanceName} is missing. Please authorize the repair promptly. |
| Hard disk drive letter unreadable | RepairNotReady | CRITICAL | The drive letter for the hard disk on your instance ${InstanceName} is unreadable. Please authorize the repair promptly. |
| NVMe controller abnormal | RepairNvmeCritalErr | CRITICAL | The NVMe controller on your instance ${InstanceName} is not functioning properly. Please authorize the repair promptly. |
| NVMe medium abnormal | RepairNvmeMediaErr | CRITICAL | The NVMe medium on your instance ${InstanceName} is malfunctioning. Please authorize the repair promptly. |
| NVMe disk missing | RepairNmveMissing | CRITICAL | The NVMe disk for your instance ${InstanceName} is missing. Please authorize the repair promptly. |
| GPU overheating | RepairGpuHighTemperature | CRITICAL | The GPU temperature on your instance ${InstanceName} is too high. Please authorize the repair promptly. |
| GPU drop out | RepairGpuMissing | CRITICAL | The GPU on your instance ${InstanceName} is missing. Please authorize the repair promptly. |
| Physical disk prediction error count exceeds limit | RepairRaidPdPreErr | CRITICAL | The predicted error count for the physical disk of your instance ${InstanceName} has exceeded the limit. Please authorize the repair promptly. |
| Other error count of physical disk exceeds limit | RepairRaidPdOtherErr | CRITICAL | The count of other physical disk errors on your instance ${InstanceName} has exceeded the limit. Please authorize the repair promptly. |
| Physical disk configuration error | RepairRaidPdUB | CRITICAL | Your instance ${InstanceName} has a configuration issue with its physical disk. Please authorize the repair promptly. |
| Physical disk fault | RepairRaidPdFailed | CRITICAL | Your instance ${InstanceName} has a physical disk failure. Please authorize the repair promptly. |
| Physical disk missing | RepairRaidPdMissing | CRITICAL | The physical disk of your instance ${InstanceName} is missing. Please authorize the repair promptly. |
| Unrecognizable physical disk | RepairRaidPdOffline | CRITICAL | The physical disk of your instance ${InstanceName} is unrecognizable. Please authorize the repair promptly. |
| CPU configuration error | RepairConfigurationError | CRITICAL | Your instance ${InstanceName} has a CPU configuration error. Please authorize the repair promptly. |
| CPU internal error | RepairIERR | CRITICAL | Your instance ${InstanceName} has an internal CPU error. Please authorize the repair promptly. |
| BIOS self-check error | RepairFRB1/BISTfailure | CRITICAL | Your instance ${InstanceName} has encountered a BIOS self-check error. Please authorize the repair promptly. |
| CPU root node error | RepairSMBIOSUncorrectableCPU-complexError | CRITICAL | Your instance ${InstanceName} has a CPU root node error. Please authorize the repair as soon as possible. |
| CPU MCE fault | RepairUncorrectablemachinecheckexception | CRITICAL | Your instance ${InstanceName} has encountered a CPU MCE fault. Please authorize the repair as soon as possible. |
| CPU detection error | RepairCorrectablemachinecheckerror | CRITICAL | Your instance ${InstanceName} has a CPU detection error. Please authorize the repair promptly. |
| Memory fault | RepairDIMMUE | CRITICAL | Your instance ${InstanceName} has encountered a memory fault. Please authorize the repair promptly. |
| Memory ECC error | RepairUncorrectableECC | CRITICAL | Your instance ${InstanceName} has a memory ECC error. Please authorize the repair promptly. |
| Memory cleanup failure | RepairMemoryScrubFailed | CRITICAL | The memory cleanup operation for your instance ${InstanceName} failed. Please authorize the repair promptly. |
| Memory parity error | RepairParity | CRITICAL | Your instance ${InstanceName} has a memory parity error. Please authorize the repair promptly. |
| Fan position error | RepairDeviceAbsent | CRITICAL | Your instance ${InstanceName} has a fan position error. Please authorize the repair promptly. |
| Uncorrectable bus error | RepairBusUncorrectableerror | CRITICAL | Your instance ${InstanceName} has encountered an uncorrectable bus error. Please authorize the repair promptly. |
| Bus critical error | RepairBusFatalError | CRITICAL | Your instance ${InstanceName} has a critical bus error. Please authorize the repair promptly. |
| PCI parity error | RepairPCIPERR | CRITICAL | Your instance ${InstanceName} has encountered a PCI parity error; please approve the repair promptly. |
| System hardware fault | RepairUndeterminedsystemhardwarefailure | CRITICAL | Your instance ${InstanceName} has a system hardware issue; please approve the repair promptly. |
| Mainboard fault | RepairUnrecoverablesystem-boardfailure | CRITICAL | Your instance ${InstanceName} has encountered a mainboard issue; please approve the repair promptly. |
| Memory fault | RepairUnrecoverablevideocontrollerfailure | CRITICAL | Your instance ${InstanceName} has encountered a memory issue; please approve the repair promptly. |
| Failure to read GPU configuration space | RepairPciHeader_None | CRITICAL | The GPU configuration space of your instance ${InstanceName} cannot be read; please approve the repair promptly. |
| Hard disk reached end of life | RepairSSDWearOut | CRITICAL | The hard drive in your instance ${InstanceName} has reached the end of its lifespan; please approve the repair promptly. |
| PCIe link slowdown | LaneDrop | WARNING | The PCIe link of your instance ${InstanceName} has slowed down. Please take note and consider the potential impact on the application running on this instance. |
| PCIe link bandwidth decreased | BWDrop | WARNING | The PCIe link bandwidth of your instance ${InstanceName} has decreased. Please take note and consider the potential impact on the application running on this instance. |
| Hard disk medium error | MediaError | WARNING | Your instance ${InstanceName} is experiencing a hard disk medium error. Please pay attention to this as it could affect the application running on this instance. |
| Hard disk IO error | IOError | WARNING | Your instance ${InstanceName} is encountering a hard disk IO error. Please be cautious as this might impact the application running on this instance. |
| SMART information cannot be read from NVMe drive | NvmeSmartFail_None | WARNING | SMART information cannot be read from the NVMe drive on your instance ${InstanceName}. Please be aware of this as it may affect the application running on this instance. |
| NVMe write/erase cycles worn out | NvmeWearOut | WARNING | The NVMe drive in your instance ${InstanceName} has exhausted its write/erase cycles. Please take note and consider the potential impact on the application running on this instance. |
| CPU cache read/write error | CPUCacheErr | WARNING | Your instance ${InstanceName} has a CPU cache read/write error. Please take note and consider the potential impact on the application running on this instance. |
| Network interface card connection abnormal | NicLinkDown | WARNING | The network interface card of your instance ${InstanceName} is experiencing issues. Please take note and consider the potential impact on the application running on this instance. |
| Network interface card slowdown | NicSpeedLow | WARNING | The speed of the network interface card on your instance ${InstanceName} has slowed down. Please take note and consider the potential impact on the application running on this instance. |
| Network interface card bus error: bandwidth decreased | NicBusErr_BW | WARNING | Your instance ${InstanceName} has encountered a network interface card bus error, leading to decreased bandwidth. Please take note and consider the potential impact on the application running on this instance. |
| Network interface card bus error: slowdown | NicBusErr_SP | WARNING | Your instance ${InstanceName} has encountered a network interface card bus error, causing a slowdown. Please take note and consider the potential impact on the application running on this instance. |
| Network port CRC error | NICCRC | WARNING | Your instance ${InstanceName} has a network port CRC error. Please take note and consider the potential impact on the application running on this instance. |
| Low output power of optical module | tx_power_is_low_alarm | WARNING | The output power of the optical module in your instance ${InstanceName} is low. Please take note and consider the potential impact on the application running on this instance. |
| Low current of optical module | bias_cur_is_low_alarm | WARNING | The current of the optical module in your instance ${InstanceName} is low. Please take note and consider the potential impact on the application running on this instance. |
| Low input power of optical module | rx_power_is_low_alarm | WARNING | The input power of the optical module in your instance ${InstanceName} is low. Please take note and consider the potential impact on the application running on this instance. |
EIP
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| Dial testing abnormal | AbnormalDialingTest | NOTICE | Please contact the service staff. |
Dedicated gateway
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| Dedicated gateway unavailable | LinkProbeInavailable | Failure | Was the link detection issue on this dedicated gateway in line with your expectations? If the dedicated line issue was unexpected, we hope you can promptly evaluate the application situation and identify the root cause. Baidu engineers will concurrently inspect relevant functions on the Baidu side to ensure your application runs smoothly. |
| Dedicated gateway available | LinkProbeAvailable | Notification | The dedicated gateway is operating normally. |
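
The Event Level column uses different vocabularies across products (for example `CRITICAL`, `WARNING`, `NOTICE` in some tables and `Failure`, `Warning`, `Notification` in others). If you route these events into your own alerting, it may help to normalize the level first. The following is a minimal sketch; the three-tier grouping and the name `normalize_level` are assumptions for illustration, not a mapping documented by Cloud Monitor.

```python
# Hypothetical normalization of the level strings that appear in these tables.
# The three-tier grouping below is an assumption, not a documented mapping.
LEVEL_TIERS = {
    "critical": "critical",
    "failure": "critical",
    "warning": "warning",
    "notice": "info",
    "notification": "info",
}


def normalize_level(raw_level: str) -> str:
    """Map a level string from the event list to one of three tiers."""
    key = raw_level.strip().lower()
    if key not in LEVEL_TIERS:
        # Fail loudly so that a new, unlisted level is not silently dropped.
        raise ValueError(f"unknown event level: {raw_level!r}")
    return LEVEL_TIERS[key]


if __name__ == "__main__":
    for level in ("Failure", "WARNING", "Notification", "NOTICE"):
        print(level, "->", normalize_level(level))
```

Raising on an unknown level is a deliberate choice in this sketch: it surfaces any level string that is not yet covered instead of guessing a tier for it.
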
VPN gateway
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| IPSEC tunnel abnormal | IPSEC_DOWN | CRITICAL | The IPsec VPN tunnel negotiation has encountered an issue. Please refer to the FAQ to identify potential causes for this abnormality. |
| IPSEC tunnel restored | IPSEC_UP | NOTICE | The IPsec VPN tunnel connection has been successfully restored. Thank you for using our service. |
| SSLVPN service abnormal | SSLVPN_DOWN | CRITICAL | The SSL VPN service is experiencing an issue, and Baidu AI Cloud is actively investigating the root cause. |
| SSLVPN service restored | SSLVPN_UP | NOTICE | The SSL VPN service has been successfully restored. Thank you for using our service. |
| The number of SSLVPN connections is close to the quota | SSLVPN_FULL | WARNING | The number of SSL VPN client connections has reached 80% of your purchased quota. Please evaluate your requirements and expand capacity as needed. |
| GRE tunnel disconnected | GRE_DOWN | CRITICAL | The GRE tunnel is currently experiencing an issue. Baidu AI Cloud is investigating the cause. |
| GRE tunnel restored | GRE_UP | NOTICE | The GRE tunnel connectivity has been successfully restored. Thank you for using our service. |
| BGP Peer disconnected | BGP_DOWN | CRITICAL | The BGP Peer status is experiencing an issue. Please refer to the FAQ to identify potential causes for this abnormality. |
| BGP Peer connection established | BGP_ESTABLISH | NOTICE | The BGP Peer connection has been successfully established. Thank you for using our service. |
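
Most VPN gateway events come in fault/recovery pairs (`IPSEC_DOWN`/`IPSEC_UP`, `GRE_DOWN`/`GRE_UP`, and so on). A consumer can use these pairs to track which faults are still open. The sketch below assumes events arrive as `(resource_id, event_type)` tuples in order; that tuple shape and the function name `open_faults` are hypothetical and do not reflect the actual Cloud Monitor payload.

```python
# Fault/recovery pairs taken from the Event type column of the VPN gateway table.
RECOVERY_FOR = {
    "IPSEC_DOWN": "IPSEC_UP",
    "SSLVPN_DOWN": "SSLVPN_UP",
    "GRE_DOWN": "GRE_UP",
    "BGP_DOWN": "BGP_ESTABLISH",
}
# Reverse lookup: recovery event type -> the fault it closes.
FAULT_FOR = {up: down for down, up in RECOVERY_FOR.items()}


def open_faults(events):
    """Return the set of (resource_id, fault_type) pairs not yet recovered.

    `events` is an iterable of (resource_id, event_type) tuples in arrival
    order. This tuple shape is hypothetical, not the Cloud Monitor payload.
    """
    active = set()
    for resource_id, event_type in events:
        if event_type in RECOVERY_FOR:
            active.add((resource_id, event_type))
        elif event_type in FAULT_FOR:
            active.discard((resource_id, FAULT_FOR[event_type]))
        # Unpaired events such as SSLVPN_FULL are ignored here.
    return active
```

For example, feeding it `IPSEC_DOWN` and then `IPSEC_UP` for the same resource leaves no open fault for that resource.
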
NAT gateway
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| NAT service restored after an anomaly | NAT_HA_SUC | CRITICAL | The NAT service has been restored. Thank you for using our service. |
| NAT service abnormal | NAT_HA_FAIL | CRITICAL | The NAT service is experiencing an issue. Baidu AI Cloud is working to fix it. |
Dedicated channel
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| The IPV4 BGP status of the dedicated channel is DOWN | IPV4BGPStatusDown | Failure | The IPV4_BGP status of this dedicated channel is currently DOWN. Please promptly assess the application situation and determine the cause. |
| The IPV4 BGP status of the dedicated channel is UP | IPV4BGPStatusUp | Notification | Please be informed that the IPV4_BGP status of this dedicated channel is now UP. |
| The IPV4_BFD status of the dedicated channel is DOWN | IPV4BFDStatusDown | Failure | The IPV4_BFD status of this dedicated channel is currently DOWN. Please promptly assess the application situation and determine the cause. |
| The IPV4_BFD status of the dedicated channel is UP | IPV4BFDStatusUp | Notification | Please be informed that the IPV4_BFD status of this dedicated channel is now UP. |
| BGP route over-limit alert | ChannelRouteLimitWarning | Failure | The number of BGP routes on this dedicated channel has exceeded 75% of the threshold. If a BGP route over-limit alert was unexpected, we recommend promptly evaluating the application situation and investigating the cause. Baidu engineers will concurrently inspect relevant functions on the Baidu side to ensure your application continues to operate smoothly. |
| BGP route over-limit error | ChannelRouteLimitFault | Failure | The number of BGP routes on this dedicated channel has exceeded the limit, and additional routes are no longer being recorded. Please evaluate your application situation promptly and investigate the cause. Baidu engineers will concurrently inspect relevant functions on the Baidu side to ensure smooth operation of your application. |
| Number of BGP routes restored | ChannelRouteLimitRecover | Notification | The number of BGP routes on this dedicated channel has returned to within the limit. |
| The IPV6_BGP status of the dedicated channel is DOWN | IPV6BGPStatusDown | Failure | The IPV6_BGP status of this dedicated channel is currently DOWN. Please promptly assess the application situation and determine the cause. |
| The IPV6_BGP status of the dedicated channel is UP | IPV6BGPStatusUp | Notification | Please be informed that the IPV6_BGP status of this dedicated channel is now UP. |
| The IPV6_BFD status of the dedicated channel is DOWN | IPV6BFDStatusDown | Failure | The IPV6_BFD status of this dedicated channel is currently DOWN. Please promptly assess the application situation and determine the cause. |
| The IPV6_BFD status of the dedicated channel is UP | IPV6BFDStatusUp | Notification | Please be informed that the IPV6_BFD status of this dedicated channel is currently UP. |
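
The two BGP route over-limit events in the table above are threshold-based: `ChannelRouteLimitWarning` is described as firing above 75% of the threshold, and `ChannelRouteLimitFault` once the limit is exceeded. The sketch below shows how a client might predict which event applies from its own route count. It assumes that "threshold" and "limit" refer to the same value and that both comparisons are strict; neither assumption is confirmed by this list, and `route_limit_event` is a hypothetical helper.

```python
from typing import Optional

WARNING_PERCENT = 75  # from the ChannelRouteLimitWarning description


def route_limit_event(route_count: int, route_limit: int) -> Optional[str]:
    """Return the event type expected for a route count, or None if in range.

    Assumes strict '>' comparisons and a single shared limit value.
    """
    if route_limit <= 0:
        raise ValueError("route_limit must be positive")
    if route_count > route_limit:
        return "ChannelRouteLimitFault"
    # Integer arithmetic avoids floating-point rounding at the 75% boundary.
    if route_count * 100 > route_limit * WARNING_PERCENT:
        return "ChannelRouteLimitWarning"
    return None
```

With a limit of 100 routes, this sketch returns `None` at 75 routes, the warning event at 76, and the fault event at 101.
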
Physical dedicated line
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| Traffic level alert of physical dedicated line | TrafficCongestionFault | Notification | Has the traffic level of this physical dedicated line met your expectations? If there is an unexpected traffic level alert for the dedicated line, we kindly ask you to evaluate the application status promptly and identify any potential issues. Baidu engineers will simultaneously inspect relevant functionalities on Baidu's side to ensure the smooth operation of your application. |
| Traffic level of physical dedicated line restored | TrafficCongestionRecover | Notification | The traffic level of the physical dedicated line has been successfully restored. |
| EVR full-machine fault | EVR_FAULT | Failure | The access device on Baidu's side, associated with this physical dedicated line, has encountered a complete machine failure. Baidu engineers are actively investigating the cause. Please keep an eye on the situation for updates. |
DDoS basic protection/Traffic burst service package (TBSP)
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| EIP attack event | ddos_event | WARNING | The DDoS attack has been detected to have ended. Please log in to the console promptly to handle it. |
| EIP ban event | pause_event | WARNING | Your instance bandwidth has exceeded the DDoS protection threshold, and ${operation} will be temporarily applied to your service. |
MapReduce BMR
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| Host crashing | BMR_INSTANCE_EVENT_DOWN | CRITICAL | Please contact the service staff for handling. |
| Host crashing eliminated | BMR_INSTANCE_EVENT_UP | NOTICE | This notice is for your information. |
| Component crashing | BMR_COMPONENT_EVENT_DOWN | CRITICAL | Please contact the service staff for handling. |
| Component crashing eliminated | BMR_COMPONENT_EVENT_UP | NOTICE | This notice is for your information. |
| BMR-Agent disconnected | BMR_AGENT_EVENT_UNCONNECT | CRITICAL | Please contact the service staff for handling. |
| BMR-Agent connection restored | BMR_AGENT_EVENT_CONNECT | NOTICE | This notice is for your information. |
| Active-Master switch occurred | BMR_CLUSTER_ACTIVE_MASTER_CHANGE | WARNING | Please contact the service staff for handling. |
Enterprise data lake management and analysis platform (EDAP)
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| EDAP job group success notification | EdapJobSuccessStatus | Notification | EDAP job group success notification: The monitor has detected that the job group ${flowName} in your project ${projectName} triggered a [${ruleStatus}] notification at ${time}. |
| EDAP off-line job success notification | EdapBatchJobSuccessStatus | Notification | EDAP offline job success notification: The monitor has detected that the offline job ${flowName} in your project ${projectName} triggered a [${ruleStatus}] notification at ${time}. |
| EDAP job group node failure alert | EdapJobErrorStatus | Warning | EDAP job monitoring function: The monitor has detected that the job node ${jobName} within the job group ${flowName} in your project ${projectName} triggered a [${ruleStatus}] monitoring alert at ${time}. Please address this issue promptly. |
| EDAP offline job failure alert | EdapBatchJobErrorStatus | Warning | EDAP job monitoring function: The monitor has detected that the offline job ${jobName} in your project ${projectName} triggered a [${ruleStatus}] monitoring alert at ${time}. Please address this issue promptly. |
| EDAP real-time job failure alert | EdapRealtimeJobErrorStatus | Warning | EDAP job monitoring function: The monitor has detected that the real-time job ${jobName} in your project ${projectName} triggered a [${ruleStatus}] monitoring alert at ${time}. Please address this issue promptly. |
| EDAP job group not triggered alert | EdapJobNotFireStatus | Warning | EDAP job monitoring function: The monitor has detected that the offline job ${jobName} in your project ${projectName} triggered a [${ruleStatus}] monitoring alert at ${time}. Please address this issue promptly. |
| EDAP offline job not triggered alert | EdapBatchJobNotFireStatus | Warning | EDAP job monitoring function: The monitor has detected that the offline job ${jobName} in your project ${projectName} triggered a [${ruleStatus}] monitoring alert at ${time}. Please address this issue promptly. |
| EDAP job group start time timeout alert | EdapJobStarttimeoutStatus | Warning | EDAP job group monitoring function: The monitor has detected that the job group ${flowName} in your project ${projectName} triggered a [${ruleStatus}] monitoring alert and failed to start at [${StratTime}]. Please handle this issue promptly. |
| EDAP job group node runtime timeout alert | EdapJobRunningtimeoutStatus | Warning | EDAP job monitoring function: The monitor has detected that the job node ${jobName} within the job group ${flowName} in your project ${projectName} triggered a [${ruleStatus}] monitoring alert at ${time}. The job operation's start time was [${StartTime}]; the alert duration upper limit is set to [${TimeOut}] minutes, and this alert has been active for [${RunningTime}] minutes. Please address this issue promptly. |
| EDAP integration job success notification | EdapJobIntegrationSuccessStatus | Notification | EDAP integration job success notification: The monitor has detected that the integration job ${jobName} in your project ${projectName} triggered a [${ruleStatus}] notification at ${time}. |
| EDAP integration job failure alert | EdapJobIntegrationErrorStatus | Warning | EDAP integration job monitoring function: The monitor has detected that the integration job ${jobName}.${sourceTableName} in your project ${projectName} triggered a [${ruleStatus}] monitoring alert at ${time}. Please address this issue promptly. |
| EDAP data quality alert | EdapQualityStatus | Notification | EDAP data quality function: The monitor has detected that the data quality task ${jobName} in your project ${projectName} triggered a [${ruleStatus}] monitoring alert at ${time}. Please address this issue promptly. |
| EDAP metadata change notification | EDAPMetaDataChangeStatus | Notification | EDAP metadata change notification: The monitor has detected that ${name} triggered a metadata change notification for the data table ${data source link}/${library}/${table} at ${time}. The modified content is ${content}. Please handle it promptly. |
| EDAP metadata deletion notification | EDAPMetaDataDeletionStatus | Notification | EDAP metadata deletion notification: The monitor has detected that the metadata deletion notification for ${data source link}/${base}/${table} was triggered by ${name} at ${time}. Please review it promptly. |
| EDAP offline job scheduling configuration change notification | EdapBatchJobSchedulerChangeStatus | Notification | EDAP offline job scheduling configuration change notification: The monitor has detected that the offline job ${flowName} in your project ${projectName} triggered a [${ruleStatus}] notification at ${time}. |
| EDAP job group scheduling configuration change notification | EdapJobSchedulerChangeStatus | Notification | EDAP job group scheduling configuration change notification: The monitor has detected that the job group ${flowName} in your project ${projectName} triggered a [${ruleStatus}] notification at ${time}. |
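
The descriptions in the table above contain `${name}` placeholders that are filled in when an event is delivered. To preview how a rendered message might look, Python's standard `string.Template` happens to use the same `${name}` syntax. The sketch below is only an illustration: how Cloud Monitor actually renders these messages is not stated here, and `render` and the sample values are hypothetical.

```python
from string import Template

# Description text copied from the EdapJobSuccessStatus row.
DESCRIPTION = (
    "The monitor has detected that the job group ${flowName} in your project "
    "${projectName} triggered a [${ruleStatus}] notification at ${time}."
)


def render(description: str, **values: str) -> str:
    """Fill ${name} placeholders; unknown or malformed ones are left as-is."""
    return Template(description).safe_substitute(**values)


if __name__ == "__main__":
    # Sample values are made up for illustration.
    print(render(DESCRIPTION, flowName="daily_etl", projectName="demo",
                 ruleStatus="SUCCESS", time="2025-01-01 08:00"))
```

`safe_substitute` is used instead of `substitute` so that placeholders that are not valid Python identifiers, such as `${data source link}`, pass through unchanged rather than raising an error.
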
Cloud database dedicated cluster (DDC)
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| The instance resizing initiated by the user failed | resizeFail | WARNING | Did the database service meet your expectations following the failed instance resizing? We kindly ask you to assess the application status and identify any potential issues promptly. Baidu engineers will simultaneously inspect relevant functionalities on Baidu's side to ensure the smooth operation of your application. |
| The instance resizing initiated by the user has begun | resizeStart | NOTICE | Did the recent instance resizing operation meet your expectations? If it was unexpected, we kindly ask you to assess the application status and identify any potential issues promptly. Baidu engineers will simultaneously inspect relevant functionalities on Baidu's side to ensure the smooth operation of your application. |
| The instance resizing initiated by the user succeeded | resizeSuccess | NOTICE | Did the database service meet your expectations after the instance resizing was successfully completed? We kindly ask you to assess the application status promptly. Baidu engineers will simultaneously inspect relevant functionalities on Baidu's side to ensure the smooth operation of your application. |
| The instance reboot initiated by the user failed | restartFail | WARNING | Did the database service meet your expectations following the failed instance reboot? We kindly ask you to assess the application status, identify any potential issues promptly, and address them as needed. Baidu engineers will simultaneously inspect relevant functionalities on Baidu's side to ensure the smooth operation of your application. |
| The instance reboot initiated by the user has begun | restartStart | NOTICE | Did the recent instance reboot operation meet your expectations? If it was unexpected, we kindly ask you to assess the application status, identify any potential issues promptly, and address them as needed. Baidu engineers will simultaneously inspect relevant functionalities on Baidu's side to ensure the smooth operation of your application. |
| The instance reboot initiated by the user succeeded | restartSuccess | NOTICE | Did the database service meet your expectations after the successful instance reboot? We hope you can evaluate the application status promptly. Meanwhile, Baidu engineers will carry out checks on relevant functions to ensure your application runs smoothly. |
| The primary-standby switch initiated by the user failed | switchOverFail | WARNING | Did the database service meet your expectations following the failure of the primary-standby switch? We hope you can examine the application status promptly and pinpoint the fault. Meanwhile, Baidu engineers will carry out checks on relevant functions to ensure your application operates without issues. |
| The primary-standby switch initiated by the user has begun | switchOverStart | NOTICE | Did the operation of the primary-standby switch meet your expectations? If the operation was unexpected, we hope you can quickly assess the application status and identify any faults. Meanwhile, Baidu engineers will carry out checks on relevant functions to ensure uninterrupted operation of your application. |
| The primary-standby switch initiated by the user succeeded | switchOverSuccess | NOTICE | Did the database service meet your expectations after the successful primary-standby switch? We hope you can evaluate the application status promptly. Meanwhile, Baidu engineers will carry out checks on relevant functions to ensure your application runs smoothly. |
Baidu container instance (BCI)
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| Notifications | BCINotice | Notification | - |
| Alert event | BCIWarning | Warning | - |
Cloud container engine (CCE)
| Event name in Chinese | Event type | Event Level | Solutions and suggestions | Remarks |
|---|---|---|---|---|
| CCE cluster abnormal event | CCE_ABNORMAL_EVENT | WARNING | Please log in to the CCE cluster to solve the problem | - |
| CCE cluster node NotReady | CCE_NODE_NOT_READY | CRITICAL | Please log in to the CCE cluster to solve the problem | - |
| CCE Pod abnormal event | CCE_POD_ABNORMAL_EVENT | WARNING | Please log in to the CCE cluster to solve the problem | Pod abnormal status generally refers to ImagePullBackOff, CrashLoopBackOff or PodFailed |
| CCE node abnormal event | CCE_NODE_ABNORMAL_EVENT | WARNING | Please log in to the CCE cluster to solve the problem | - |
| CCE cluster node group failed to scale down | ScaleDownFailed | WARNING | Your node group ${NodegroupName} failed to scale down. Please log in to the CCE cluster to solve the problem. | - |
| CCE cluster node group failed to scale up | FailedToScaleUpGroup | WARNING | Your node group ${NodegroupName} failed to scale up. Please log in to the CCE cluster to solve the problem. | - |
Cloud Database SCS for Redis
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| Fault switch has begun | failOverStart | Notification | Did the database service meet your expectations after the primary-standby switch? If the fault was unexpected, we hope you can evaluate the application promptly and identify the source of the issue. Meanwhile, Baidu engineers will carry out checks on relevant functions to ensure the smooth execution of your application. |
| Fault switch succeeded | failOverSuccess | Notification | Did the database service meet your expectations after the primary-standby switch? If the fault was unexpected, we hope you can evaluate the application promptly and identify the source of the issue. Meanwhile, Baidu engineers will carry out checks on relevant functions to ensure the smooth execution of your application. |
| The specification change has begun | SpecificationChangesStart | Notification | Does this change in node specifications meet your expectations? If the specification change was unexpected, we hope you can swiftly assess the application status and identify the source of the issue. Baidu engineers will also check relevant functions to maintain your application's stable performance. |
| The specification change succeeded | SpecificationChangesSuccess | Notification | Does the database service meet your expectations following the successful change in node specifications? If the change was unexpected, we hope you can promptly evaluate the application status and locate the fault. Simultaneously, Baidu engineers will verify relevant functions to ensure seamless application performance. |
| The specification change failed | SpecificationChangesFailed | Notification | The change in node specifications failed, and we hope you can promptly assess the application status and locate the fault. Simultaneously, Baidu engineers will verify relevant functions to ensure that your application performs smoothly and reliably. |
| The instance reboot has begun | restartStart | Notification | The instance reboot process has started. Please promptly assess the status of your application. Baidu engineers will simultaneously verify relevant functions to guarantee your application's smooth operation. |
| The instance reboot succeeded | restartSuccess | Notification | The instance reboot completed successfully. Please assess your application status promptly. At the same time, Baidu engineers will conduct checks on relevant functions to ensure your application runs without issues. |
| The instance reboot failed | restartFail | Warning | The instance reboot failed. Please evaluate your application status promptly. Meanwhile, Baidu engineers will carry out checks on relevant functions to ensure smooth operation of your application. |
Cloud database RDS
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| Fault switch has begun | failOverStart | Notification | Did the database service meet your expectations after the primary-standby switch? If the fault was unexpected, we hope you can evaluate the application promptly and identify the source of the issue. Meanwhile, Baidu engineers will carry out checks on relevant functions to ensure the smooth execution of your application. |
| Fault switch succeeded | failOverSuccess | Notification | Did the database service meet your expectations after the primary-standby switch? If the fault was unexpected, we hope you can evaluate the application promptly and identify the source of the issue. Meanwhile, Baidu engineers will carry out checks on relevant functions to ensure the smooth execution of your application. |
| Fault switch failed | failOverFailed | Warning | Did the database service meet your expectations after the primary-standby switch? If the fault was unexpected, we hope you can evaluate the application promptly and identify the source of the issue. Meanwhile, Baidu engineers will carry out checks on relevant functions to ensure the smooth execution of your application. |
| The primary-standby switch has begun | switchOverStart | Notification | Did the database service meet your expectations after the primary-standby switch? If the fault was unexpected, we hope you can evaluate the application promptly and identify the source of the issue. Meanwhile, Baidu engineers will carry out checks on relevant functions to ensure the smooth execution of your application. |
| The primary-standby switch succeeded | switchOverSuccess | Notification | Did the database service meet your expectations after the primary-standby switch? If the fault was unexpected, we hope you can evaluate the application promptly and identify the source of the issue. Meanwhile, Baidu engineers will carry out checks on relevant functions to ensure the smooth execution of your application. |
| The primary-standby switch failed | switchOverFailed | Warning | Did the database service meet your expectations after the primary-standby switch? If the fault was unexpected, we hope you can evaluate the application promptly and identify the source of the issue. Meanwhile, Baidu engineers will carry out checks on relevant functions to ensure the smooth execution of your application. |
| The instance resizing has begun | SpecificationChangesStart | Notification | Did the database service meet your expectations following the instance resizing? If the resizing resulted in an unexpected fault, we hope you can promptly evaluate the application status and pinpoint the source of the problem. Simultaneously, Baidu engineers will conduct checks on relevant functions to guarantee consistent application performance. |
| The instance resizing succeeded | SpecificationChangesSuccess | Notification | Did the database service meet your expectations following the instance resizing? If the resizing resulted in an unexpected fault, we hope you can promptly evaluate the application status and pinpoint the source of the problem. Simultaneously, Baidu engineers will conduct checks on relevant functions to guarantee consistent application performance. |
| The instance resizing failed | SpecificationChangesFailed | Warning | Did the database service meet your expectations following the instance resizing? If the resizing resulted in an unexpected fault, we hope you can promptly evaluate the application status and pinpoint the source of the problem. Simultaneously, Baidu engineers will conduct checks on relevant functions to guarantee consistent application performance. |
| The instance cloning has begun | CloneStart | Notification | Did the database service meet your expectations after the instance cloning? If the instance cloning was unexpected, we hope you can quickly assess the application status and identify any faults. Meanwhile, Baidu engineers will conduct checks on relevant functions to ensure stable application operation. |
| The instance cloning succeeded | CloneSuccess | Notification | Did the database service meet your expectations after the instance cloning? If the instance cloning was unexpected, we hope you can quickly assess the application status and identify any faults. Meanwhile, Baidu engineers will conduct checks on relevant functions to ensure stable application operation. |
| The data recovery of the instance cloning failed | CloneDataRecoveryFailed | Warning | The data recovery for instance cloning failed, and manual intervention is underway. If the cloning failure was unexpected, we encourage you to promptly evaluate the application status and identify the issue. Meanwhile, Baidu engineers will verify relevant functions to ensure continuous application performance. |
| The instance cloning resources are insufficient | CloneNoAvailableResource | Warning | Instance cloning failed due to insufficient resources. If this failure was unexpected, we urge you to promptly assess the application status and locate the fault. Baidu engineers will simultaneously carry out checks on relevant functions to ensure your application's stability. |
| The instance reboot has begun | restartStart | Notification | Did the database service meet your expectations after the instance reboot? If the reboot was unexpected, we hope you can quickly evaluate the application status and identify the fault. Meanwhile, Baidu engineers will check related functions to guarantee your application's smooth performance. |
| The instance reboot succeeded | restartSuccess | Notification | Did the database service meet your expectations after the instance reboot? If the reboot was unexpected, we hope you can quickly evaluate the application status and identify the fault. Meanwhile, Baidu engineers will check related functions to guarantee your application's smooth performance. |
| The instance reboot failed | restartFail | Warning | Did the database service meet your expectations after the instance reboot? If the reboot was unexpected, we hope you can quickly evaluate the application status and identify the fault. Meanwhile, Baidu engineers will check related functions to guarantee your application's smooth performance. |
| Fault injection has begun | failInjectStart | Notification | Did the database service meet your expectations following the fault injection? If the fault injection was unexpected, we hope you can promptly examine the application status and pinpoint the issue. Simultaneously, Baidu engineers will verify related functions to ensure consistent application performance. |
| Fault injection succeeded | failInjectSuccess | Notification | Did the database service meet your expectations after the instance fault injection? If the fault injection was unexpected, we urge you to promptly evaluate the application status and identify the issue. Meanwhile, Baidu engineers will check related functions to maintain seamless application operation. |
| Fault injection failed | failInjectFail | Warning | Did the database service meet your expectations after the instance fault injection? If the fault injection was unexpected, we urge you to promptly evaluate the application status and identify the issue. Meanwhile, Baidu engineers will check related functions to maintain seamless application operation. |
| Account creation has begun | createAccountStart | Notification | Did the database service meet your expectations after the account creation? If an unexpected fault occurred, we hope you will promptly assess the application status and locate the fault. Baidu engineers will simultaneously conduct checks on related functions to ensure stable application performance. |
| Account creation succeeded | createAccountSucc | Notification | Did the database service meet your expectations after the account creation? If an unexpected fault occurred, we hope you will promptly assess the application status and locate the fault. Baidu engineers will simultaneously conduct checks on related functions to ensure stable application performance. |
| Account creation failed | createAccountFail | Warning | Did the database service meet your expectations after the account creation? If an unexpected fault occurred, we hope you will promptly assess the application status and locate the fault. Baidu engineers will simultaneously conduct checks on related functions to ensure stable application performance. |
| Launching of public network has begun | openEipStart | Notification | Did the database service meet your expectations following the activation of the public network? If an unexpected issue arose, we encourage you to promptly evaluate the application status and identify the fault. Meanwhile, Baidu engineers will conduct checks on relevant functions to ensure smooth application performance. |
| Launching of public network succeeded | openEipSucc | Notification | Did the database service meet your expectations following the activation of the public network? If an unexpected issue arose, we encourage you to promptly evaluate the application status and identify the fault. Meanwhile, Baidu engineers will conduct checks on relevant functions to ensure smooth application performance. |
| Launching of public network failed | openEipFail | Warning | Did the database service meet your expectations following the activation of the public network? If an unexpected issue arose, we encourage you to promptly evaluate the application status and identify the fault. Meanwhile, Baidu engineers will conduct checks on relevant functions to ensure smooth application performance. |
Baidu object storage (BOS)
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| Bucket upload timeout | BucketUploadTimeout | Warning | The Bucket (name: BucketName) you created in the North China - Beijing Region has experienced an upload timeout. Please handle it as soon as possible. Thank you. |
| Bucket download timeout | BucketDownloadTimeout | Warning | The Bucket (name: BucketName) you created in the North China - Beijing Region has experienced a download timeout. Please handle it as soon as possible. Thank you. |
| The total bandwidth of the Bucket is about to exceed the traffic control | BucketBandwidthThresholdExceededSoon | Notification | The Bucket (name: BucketName) you created in the North China - Beijing Region is about to exceed the bandwidth threshold. Please submit a work order promptly to request an adjustment. Thank you. |
| The total bandwidth of the Bucket exceeded the traffic control | BucketBandwidthThresholdExceeded | Warning | The Bucket (name: BucketName) you created in the North China - Beijing Region has exceeded the bandwidth threshold. Please submit a work order promptly to request an adjustment. Thank you. |
Product change
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| Product launch | deployment | Notification | [Operation Notice] Baidu AI Cloud will adjust the default routing policy for the basic service Pods of the Suzhou C4 cluster. [Operation Time] 01:00-02:00 on March 11, 2021. [Operation Impact] This operation is expected to be unnoticeable. |
Trading system billing
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| The resource is about to expire | ResourcesExpiration | Warning | If you need to continue using the resource, please renew it in time or enable automatic renewal. If you no longer need it, please back up the data in advance. Thank you for your support. |
| The resource is about to be released | ResourcesRelease | Warning | If you need to continue using the resource, please renew it in time or enable automatic renewal. If you no longer need it, please back up the data in advance. Thank you for your support. |
Cloud database DocDB for MongoDB
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| The specification change has begun | SpecificationChangesStart | NOTICE | The instance resizing process has started. Please keep assessing your application's status to ensure it meets your expectations. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| The specification change succeeded | SpecificationChangesSuccess | NOTICE | The instance resizing process succeeded. Please assess your application's status promptly. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| The specification change failed | SpecificationChangesFailed | WARNING | The instance resizing process failed. Please assess your application's status promptly. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| Instance node fault | NodeFailureStatus | WARNING | The instance node is malfunctioning. Please assess your application's status promptly. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| Master-slave switch of instance node | NodeSwitchOverStatus | NOTICE | The instance has performed a master-slave switch. Please assess your application's status promptly. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
Cloud native database GaiaDB
| Event name in Chinese | Event type | Event Level | Solutions and suggestions |
|---|---|---|---|
| The master-slave switch has begun | switchOverStart | Notification | Did the database service meet your expectations after the master-slave switch? If the switch was unexpected, we recommend assessing the application's status promptly and pinpointing the fault source. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| The master-slave switch succeeded | switchOverSuccess | Notification | Did the database service meet your expectations after the master-slave switch? If the switch was unexpected, we recommend assessing the application's status promptly and pinpointing the fault source. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| The master-slave switch failed | switchOverFailed | Warning | Did the database service meet your expectations after the master-slave switch? If the switch was unexpected, we recommend assessing the application's status promptly and pinpointing the fault source. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| The resizing of the computing node has begun | SpecificationChangesStart | Notification | Did the database service meet your expectations after resizing the computing node? If the resizing was unexpected, we recommend assessing the application's status promptly and pinpointing the fault source. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| The resizing of the computing node succeeded | SpecificationChangesSuccess | Notification | Did the database service meet your expectations after resizing the computing node? If the resizing was unexpected, we recommend assessing the application's status promptly and pinpointing the fault source. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| The resizing of the computing node failed | SpecificationChangesFailed | Warning | Did the database service meet your expectations after resizing the computing node? If the resizing was unexpected, we recommend assessing the application's status promptly and pinpointing the fault source. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| The cluster cloning has begun | CloneStart | Notification | Did the database service meet your expectations following the cluster cloning? If the cluster cloning was unintentional, we recommend assessing the application's status promptly and pinpointing the fault source. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| The cluster cloning succeeded | CloneSuccess | Notification | Did the database service meet your expectations following the cluster cloning? If the cluster cloning was unintentional, we recommend assessing the application's status promptly and pinpointing the fault source. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| The cluster cloning failed | CloneFailed | Warning | The restoration of data from cluster cloning failed. If the cluster cloning failure was unexpected, we recommend assessing the application's status promptly and pinpointing the fault source. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| Failover has begun | failOverStart | Notification | Did the database service meet your expectations after the failover? If the failover was unexpected, we recommend assessing the application's status promptly and pinpointing the fault source. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| Failover succeeded | failOverSuccess | Notification | Did the database service meet your expectations after the failover? If the failover was unexpected, we recommend assessing the application's status promptly and pinpointing the fault source. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| Failover failed | failOverFailed | Warning | Did the database service meet your expectations after the failover? If the failover was unexpected, we recommend assessing the application's status promptly and pinpointing the fault source. Baidu engineers will concurrently check related functions on the Baidu side to guarantee the smooth operation of your application. |
| Fault injection has begun | failInjectStart | Notification | Did the database service meet your expectations following the fault injection? If the fault injection was unexpected, we hope you can promptly examine the application status and pinpoint the issue. Simultaneously, Baidu engineers will verify related functions to ensure consistent application performance. |
| Fault injection succeeded | failInjectSuccess | Notification | Did the database service meet your expectations after the instance fault injection? If the fault injection was unexpected, we urge you to promptly evaluate the application status and identify the issue. Meanwhile, Baidu engineers will check related functions to maintain seamless application operation. |
| Fault injection failed | failInjectFail | Warning | Did the database service meet your expectations after the instance fault injection? If the fault injection was unexpected, we urge you to promptly evaluate the application status and identify the issue. Meanwhile, Baidu engineers will check related functions to maintain seamless application operation. |
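
The tables in this list spell event levels in more than one way (for example `Notification` and `NOTICE`, `Warning` and `WARNING`). A consumer of these events may want to fold the spellings into one scale before deciding which events need follow-up. The Python sketch below illustrates one way to do that. It is not part of any Baidu SDK: the names `Level`, `normalize_level`, and `needs_attention` are hypothetical, and treating `NOTICE` as equivalent to `Notification` is an assumption based only on how the tables read.

```python
from enum import Enum


class Level(Enum):
    """Hypothetical unified severity scale for the levels used in this list."""
    NOTIFICATION = 1
    WARNING = 2
    CRITICAL = 3


# Level spellings as they appear in the tables, lower-cased.
# Assumption: "NOTICE" means the same as "Notification".
_LEVEL_ALIASES = {
    "notification": Level.NOTIFICATION,
    "notice": Level.NOTIFICATION,
    "warning": Level.WARNING,
    "critical": Level.CRITICAL,
}


def normalize_level(raw: str) -> Level:
    """Map a level string from the tables to a single Level value."""
    key = raw.strip().lower()
    if key not in _LEVEL_ALIASES:
        raise ValueError(f"unknown event level: {raw!r}")
    return _LEVEL_ALIASES[key]


def needs_attention(raw_level: str) -> bool:
    """Return True for Warning and above (an illustrative policy only)."""
    return normalize_level(raw_level).value >= Level.WARNING.value
```

With this sketch, `needs_attention("WARNING")` and `needs_attention("Critical")` are true, while `needs_attention("NOTICE")` is false. The threshold is a design choice and can be moved to suit your own alerting policy.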
