CCE Resource Recommender User Documentation

Updated at:2025-10-27

Component introduction

  • Kubernetes improves business orchestration and resource utilization, but without additional support the improvement remains limited.
  • The main cause of low resource utilization in Kubernetes clusters lies in the resource scheduling logic. When creating a workload, users configure resource requests and limits for its containers. The request has the greatest impact on utilization, because the scheduler reserves the requested amount for the workload.
  • To keep their workloads from being preempted by others, or to cope with peak traffic, users tend to set high resource request values.
  • The gap between requested and actually used resources cannot be used by other workloads, so it is wasted.
  • Unreasonable resource request values therefore result in low resource utilization within Kubernetes clusters.
  • Baidu AI Cloud CCE allows you to install the CCE Resource Recommender component in the cluster. The component recommends container-level resource request values for Kubernetes workloads, which reduces resource waste.

Resource objects deployed in the cluster

Installing the CCE Resource Recommender component deploys the following Kubernetes objects in the cluster:

| Kubernetes object name | Type | Namespace |
| --- | --- | --- |
| analytics.analysis.baidubce.com | CustomResourceDefinition | - |
| recommendations.analysis.baidubce.com | CustomResourceDefinition | - |
| analysis-default | Analytics | kube-system |
| recommendation-configuration | ConfigMap | kube-system |
| recommenderd | ClusterRole | - |
| recommenderd | ClusterRoleBinding | - |
| recommenderd | Service | kube-system |
| recommenderd | ClusterRoleBinding | kube-system |
| recommenderd | ServiceAccount | kube-system |
| recommenderd | Deployment | kube-system |

Function description

  • Recommends appropriate resource requests for each container in Deployment, StatefulSet, and DaemonSet workloads.
  • Maintains request ratios: recommended requests preserve the ratios between the requests initially set on the containers in the workload.

CCE Resource Recommender recommendation principles

The component creates an Analytics CR object in the kube-system namespace that covers all Kubernetes-native workloads (Deployment, DaemonSet, StatefulSet) in the cluster. It analyzes up to 14 days of monitoring data and updates the recommended values every 12 hours. Based on this Analytics object, the component then generates a Recommendation CR object for each workload in the cluster to store the recommended data. Once a Recommendation CR contains recommended data, that data is written into an annotation on the corresponding workload.

Note

Environmental requirements

  • Kubernetes version: 1.18 or later
  • The cluster must be connected to Cloud Managed Service for Prometheus (CProm) or to a self-built Prometheus data source

Controlled resource requirements

  • Supports Deployment, StatefulSet, and DaemonSet workloads.
  • Does not support Job, CronJob, or standalone Pods that are not part of a workload.

Recommended calculation

  • Minimum recommended values: A single container should request at least 0.125 CPU cores (125m) and 125Mi of memory.
  • This component analyzes historical monitoring data of workloads automatically to recommend suitable resource request values.
  • Installation of this component does not take immediate effect. Historical resource usage data needs to be analyzed to calculate accurate recommendations.
  • The calculation time may vary for different workloads, and workloads within the cluster could impact each other's calculations.
  • After installing this component, recommended data will be generated for workloads that have been running for a minimum of one day.
  • For workloads created after the component installation, it typically takes one day to generate the recommended data for these workloads.
  • It is advised to update the workload with the recommended values after it has been running smoothly for a period.
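
The minimum values above can be read as a floor applied to whatever the analysis produces. The following is a minimal sketch under that assumption; it does not reproduce the component's actual calculation, which is not described here. The function names are hypothetical, and only the `m` CPU suffix and the `Ki`/`Mi`/`Gi` memory suffixes are handled. A memory value without a unit is treated as bytes.

```python
from typing import Tuple

# Documented minimums for a single container.
MIN_CPU_MILLICORES = 125             # 0.125 cores, written "125m"
MIN_MEMORY_BYTES = 125 * 1024 ** 2   # "125Mi"

# Binary suffixes handled by this sketch (an intentionally small subset).
_MEMORY_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "125m" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "125Mi" or "58243235" to bytes."""
    for suffix, factor in _MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no unit: the value is already in bytes


def apply_minimums(cpu: str, memory: str) -> Tuple[int, int]:
    """Return (millicores, bytes), raised to the documented minimums."""
    return (
        max(cpu_to_millicores(cpu), MIN_CPU_MILLICORES),
        max(memory_to_bytes(memory), MIN_MEMORY_BYTES),
    )
```

For example, `apply_minimums("50m", "64Mi")` raises both values to the minimums, while `apply_minimums("2", "1Gi")` leaves them unchanged.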

Instructions for use

Install component

Install the component using the official Baidu AI Cloud Helm chart.

Installation process

1. Enable CProm

  • Log in to the CCE management console, navigate to Monitor Logs > Prometheus Monitor, and select Connect Instance.

  • Once the instance is connected successfully, click Redirect to Cloud Managed Service for Prometheus.

  • On the Prometheus Instance page, copy the instance ID and remote read address, then generate and copy the token. This information will be needed in the next installation step.

2. Install CCE Resource Recommender

| Parameter | Description | Default value | Required |
| --- | --- | --- | --- |
| recommenderd.containerArgs.prometheus-address | Prometheus address used by recommenderd (append /prometheus to the CProm address) | Empty | Yes |
| recommenderd.containerArgs.prometheus-auth-instanceId | CProm instance ID used by recommenderd | Empty | Yes |
| recommenderd.containerArgs.prometheus-auth-bearertoken | CProm token used by recommenderd | Empty | Yes |
| analysisDefault.enable | Whether to enable the global default resource recommendation configuration | true | No |
  • Example of adding parameters
YAML
# Default values for cce-resource-recommender.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
recommenderd:
  image:
    repository: registry.baidubce.com/cce-plugin-pro/cce-resource-recommender
    pullPolicy: IfNotPresent
    # Overrides the image tag whose default is the chart appVersion.
    tag: "v1.0.0"
  replicaCount: 2
  containerArgs:
    feature-gates: Analysis=true
    v: 2
    prometheus-address: https://cprom.gz.baidubce.com/test/select/prometheus
    prometheus-auth-instanceId: cprom-pjuun6b516c71
    prometheus-auth-bearertoken: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJuYW1lc3BhY2UiOiJjcHJvbS1wanV1bjZiNTE2YzcxIiwic2VjcmV0TmFtZSI6ImVjYTk3ZTE0OGNiNzRlOTY4M2Q3YjcyNDA4MjlkMWZmIiwiZXhwIjoxNzgzODUwMTkyLCJpc3MiOiJjcHJvbSJ9.U5VkXlKbSJvOqPHWW_gGOhaEJA-hDdvsOyIHgYijacA
  podAnnotations: { }
  resources: { }
  nodeSelector: { }
  tolerations: [ ]
  affinity: { }
analysisDefault:
  enable: true
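
The three required containerArgs values can be assembled and checked before they are written into the values file. The following is a minimal sketch, assuming the Prometheus address is formed by appending /prometheus to the CProm address and that the token carries a "Bearer " prefix, as the example values suggest. The function name build_container_args and the sample inputs are hypothetical and are not part of the chart.

```python
def build_container_args(cprom_address: str, instance_id: str, token: str) -> dict:
    """Return the required recommenderd.containerArgs entries.

    Assumption: the Prometheus address is the CProm address with
    "/prometheus" appended, as in the example values file.
    """
    # All three values are marked as required in the parameter table.
    for name, value in (("cprom_address", cprom_address),
                        ("instance_id", instance_id),
                        ("token", token)):
        if not value or not value.strip():
            raise ValueError(f"{name} is required and must not be empty")

    # Append the suffix only once, regardless of a trailing slash.
    address = cprom_address.strip().rstrip("/")
    if not address.endswith("/prometheus"):
        address += "/prometheus"

    # Add the "Bearer " prefix if the copied token does not include it.
    bearer = token.strip()
    if not bearer.startswith("Bearer "):
        bearer = "Bearer " + bearer

    return {
        "prometheus-address": address,
        "prometheus-auth-instanceId": instance_id.strip(),
        "prometheus-auth-bearertoken": bearer,
    }


if __name__ == "__main__":
    # Placeholder inputs; substitute the values copied from the Prometheus Instance page.
    args = build_container_args(
        "https://cprom.gz.baidubce.com/test/select",
        "cprom-example",
        "example-token",
    )
    for key, value in args.items():
        print(f"{key}: {value}")
```

The printed key-value pairs have the same shape as the entries under containerArgs in the example, so they can be copied into the values file after review.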

3. Verify whether the installation is successful

Run the following command to check that the recommenderd Deployment is healthy:

kubectl get deploy recommenderd -n kube-system

Output similar to the following is returned:

NAME           READY   UP-TO-DATE   AVAILABLE   AGE
recommenderd   2/2     2            2           37s
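
When this check is scripted, the READY column in the output above is the value to inspect. The following is a minimal sketch that parses the tabular text printed by kubectl get deploy and reports whether all replicas are ready. The function name deployment_is_ready is hypothetical, and the sketch assumes the default column layout shown above.

```python
def deployment_is_ready(kubectl_output: str, name: str) -> bool:
    """Return True if the READY column for `name` shows all replicas ready."""
    for line in kubectl_output.strip().splitlines():
        fields = line.split()
        # Skip the header row and rows for other deployments.
        if len(fields) < 2 or fields[0] != name:
            continue
        ready, _, desired = fields[1].partition("/")
        if not (ready.isdigit() and desired.isdigit()):
            return False
        # Ready means at least one replica is desired and all of them are ready.
        return int(desired) > 0 and int(ready) == int(desired)
    # The deployment was not found in the output.
    return False


if __name__ == "__main__":
    sample = (
        "NAME           READY   UP-TO-DATE   AVAILABLE   AGE\n"
        "recommenderd   2/2     2            2           37s\n"
    )
    print(deployment_is_ready(sample, "recommenderd"))
```

The input text would typically be the captured standard output of the kubectl command shown above.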

Uninstall CCE Resource Recommender

Run the following command, replacing {$kubeconfig} with the path to the cluster's kubeconfig file:

helm --kubeconfig {$kubeconfig} uninstall cce-resource-recommender -n kube-system

Obtain recommended values in the backend

Workload

The CCE Resource Recommender component writes the recommended values into the analysis.baidubce.com/resource-recommendation annotation of the respective workload. You can retrieve the recommended values for each workload through the standard Kubernetes API and integrate them into your release system. The recommended requests for each container in the workload appear as follows:

YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    analysis.baidubce.com/resource-recommendation: |
      containers:
      # If a Pod contains multiple containers, each container has its own recommended CPU and memory requests
      - containerName: nginx
        target:
          cpu: 125m
          memory: 125Mi # A value without a unit, such as "58243235", is in bytes
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2024-06-11T03:15:57Z"
  generation: 1
  labels:
    app: nginx
  name: deployment-example
  namespace: default
  resourceVersion: "1118119"
  uid: 8b6d54d9-c683-4e76-a95e-658e14a954b1
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx
    spec:
      containers:
      - image: hub.baidubce.com/cce/nginx-alpine-go:latest
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /
            port: 80
            scheme: HTTP
          initialDelaySeconds: 20
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 5
        name: nginx
        ports:
        - containerPort: 80
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /
            port: 80
            scheme: HTTP
          initialDelaySeconds: 5
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 250m
            memory: 512Mi
          requests:
            cpu: 250m
            memory: 512Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
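
Because a recommended memory value may appear either with a unit (125Mi) or as a plain byte count (58243235), a release system that consumes the annotation may want to normalize quantities before comparing them with the current requests. The following is a minimal sketch covering only the common suffixes. The function names are hypothetical, and it is not a full implementation of the Kubernetes quantity format.

```python
from decimal import Decimal

# Common binary and decimal suffixes for memory quantities (subset only).
_MEMORY_SUFFIXES = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "125m" or "0.5" to millicores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "125Mi" or "58243235" to bytes."""
    quantity = quantity.strip()
    for suffix, factor in _MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    # No suffix: the value is already expressed in bytes.
    return int(Decimal(quantity))


if __name__ == "__main__":
    # Recommended values versus current requests from the example Deployment.
    print(cpu_to_millicores("125m"), cpu_to_millicores("250m"))
    print(memory_to_bytes("125Mi"), memory_to_bytes("512Mi"))
```

With both sides converted to integers, the recommended and current requests can be compared directly, whichever form the recommendation uses.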

Recommendation CR

Based on the analysis results, the CCE Resource Recommender component generates a Recommendation CR object for each workload in the cluster. The CR stores the recommended data, and the recommended values are also saved to the workload's annotations.

YAML
apiVersion: analysis.baidubce.com/v1alpha1
kind: Recommendation
metadata:
  annotations:
    analysis.baidubce.com/run-number: "1"
  creationTimestamp: "2024-06-11T04:54:27Z"
  generateName: analysis-default-resource-
  generation: 2
  labels:
    analysis.baidubce.com/analytics-uid: 83cccfd5-b3c5-45aa-a92a-d9dd607dc75f
    analysis.baidubce.com/recommendation-rule-name: analysis-default
    analysis.baidubce.com/recommendation-rule-recommender: Resource
    analysis.baidubce.com/recommendation-rule-uid: bce27929-64d6-4b5f-89e4-001cbed5ed64
    analysis.baidubce.com/recommendation-target-kind: StatefulSet
    analysis.baidubce.com/recommendation-target-name: agent-q7vl19h81
    analysis.baidubce.com/recommendation-target-version: v1
    app.kubernetes.io/component: vmagent
    app.kubernetes.io/instance: agent-q7vl19h81
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: monitor-agent
    app.kubernetes.io/version: 0.2.0
    helm.sh/chart: monitor-agent-0.3.6
  name: analysis-default-resource-2gzjt
  namespace: default
  ownerReferences:
  - apiVersion: analysis.baidubce.com/v1alpha1
    blockOwnerDeletion: false
    controller: false
    kind: RecommendationRule
    name: analysis-default
    uid: bce27929-64d6-4b5f-89e4-001cbed5ed64
  resourceVersion: "1118082"
  uid: 4793159d-83b5-45db-8c34-c3443a7c45cd
spec:
  adoptionType: StatusAndAnnotation
  completionStrategy:
    completionStrategyType: Once
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: agent-q7vl19h81
    namespace: cprom-system
  type: Resource
status:
  action: Patch
  conditions:
  - lastTransitionTime: "2024-06-11T04:54:28Z"
    message: Recommendation is ready
    reason: RecommendationReady
    status: "True"
    type: Ready
  currentInfo: '{"spec":{"template":{"spec":{"containers":[{"name":"sidecar","resources":{"requests":{"cpu":"100m","memory":"100Mi"}}},{"name":"vmagent","resources":{"requests":{"cpu":"100m","memory":"100Mi"}}}]}}}}'
  lastUpdateTime: "2024-06-11T04:54:28Z"
  recommendedInfo: '{"spec":{"template":{"spec":{"containers":[{"name":"sidecar","resources":{"requests":{"cpu":"125m","memory":"125Mi"}}},{"name":"vmagent","resources":{"requests":{"cpu":"125m","memory":"125Mi"}}}]}}}}'
  recommendedValue: |
    resourceRequest:
      containers:
      - containerName: sidecar
        target:
          cpu: 125m
          memory: 125Mi
      - containerName: vmagent
        target:
          cpu: 125m
          memory: 125Mi
  targetRef: {}

In this example:

  • targetRef points to the StatefulSet agent-q7vl19h81 in the cprom-system namespace.
  • type is Resource, indicating a resource recommendation.
  • adoptionType is StatusAndAnnotation, indicating that the recommendation results are written to both recommendation.status and an annotation on the target workload.
  • recommendedInfo holds the recommended resource configuration and currentInfo holds the current resource configuration, both in JSON format. The recommended configuration can be applied to the targetRef workload with kubectl patch.
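
Before a recommendation is applied, it can help to see exactly which requests would change. The following is a minimal sketch that compares the currentInfo and recommendedInfo JSON strings from the status of a Recommendation and lists the per-container differences. The function names are hypothetical, and the sketch assumes both strings follow the structure shown in the example above.

```python
import json


def container_requests(info: str) -> dict:
    """Map container name to its resource requests from an info JSON string."""
    doc = json.loads(info)
    containers = doc["spec"]["template"]["spec"]["containers"]
    return {c["name"]: c["resources"]["requests"] for c in containers}


def request_changes(current_info: str, recommended_info: str) -> dict:
    """Return {container: {resource: (current, recommended)}} for changed values."""
    current = container_requests(current_info)
    recommended = container_requests(recommended_info)
    changes = {}
    for name, target in recommended.items():
        before = current.get(name, {})
        diff = {res: (before.get(res), value)
                for res, value in target.items()
                if before.get(res) != value}
        if diff:
            changes[name] = diff
    return changes


if __name__ == "__main__":
    # Strings copied from status.currentInfo and status.recommendedInfo in the example.
    current = '{"spec":{"template":{"spec":{"containers":[{"name":"sidecar","resources":{"requests":{"cpu":"100m","memory":"100Mi"}}},{"name":"vmagent","resources":{"requests":{"cpu":"100m","memory":"100Mi"}}}]}}}}'
    recommended = '{"spec":{"template":{"spec":{"containers":[{"name":"sidecar","resources":{"requests":{"cpu":"125m","memory":"125Mi"}}},{"name":"vmagent","resources":{"requests":{"cpu":"125m","memory":"125Mi"}}}]}}}}'
    for container, diff in request_changes(current, recommended).items():
        for resource, (old, new) in diff.items():
            print(f"{container}: {resource} {old} -> {new}")
```

Since recommendedInfo already has the shape of a patch body for the target workload, one option is to pass that string to kubectl patch with the -p flag once the differences have been reviewed.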
