CCE Usage Checklist
Overview
CCE provides a container management service based on native Kubernetes. To help users better leverage CCE, we have compiled a checklist of best practices covering cluster management, application deployment, and troubleshooting. We strongly encourage CCE users to review this checklist before launching services, to ensure a smooth transition to CCE and to reduce risks such as application failures or cluster reconfiguration caused by improper usage.
Cluster check items
| Type | Item | Suggestions | Reference documentation |
|---|---|---|---|
| Cluster | Count of nodes | Regardless of service scale, an online service cluster should keep more than one node (at least 2) and reserve a resource buffer, so that a single-point failure cannot bring the service down. | |
| Cluster | Node password | The root password of each node must be a strong password. | |
| Cluster | Node network | For clusters requiring external access, do not bind EIPs directly to nodes, as this exposes them to security risks. Instead, enable public network access by attaching a NAT gateway to the cluster's VPC. | Container network accesses the public network via NAT gateway |
| Cluster | VPC route | The CCE container network relies on VPC routing. Routing rules created by CCE are marked as "auto generated by cce". When creating new routes, avoid conflicts with these rules; if a conflict is unavoidable, submit a support ticket for assistance. | |
| Cluster | Security group | When configuring a security group, allow access to the node network, the container network, and the 100.64.230.0/24 segment, as well as ports 22, 6443, and 30000-32768. Otherwise the cloud container engine may experience network issues. | |
| Cluster | Disk capacity | When creating a cluster, it is highly recommended to allocate at least 100 GB of CDS storage to each node (this is the default in CCE). | |
| Cluster | Node scaling | Scaling down removes machines from the cluster and may leave it with insufficient capacity. CCE does not directly support resizing virtual machines; perform resizing on the BCC page, and scale down before scaling up to minimize service disruptions. | |
| Cluster | Virtual machine monitoring | Excessive CPU, memory, or disk usage on a VM may affect cluster stability. CCE includes an eviction mechanism that migrates some instances when node load becomes too high, so it is highly recommended to set up node monitoring alarms in BCM. | Add alarm on BCM |
| Cluster | Baidu AI Cloud third-party resources | Do not directly modify resources (including names and other configurations) created by CCE on the BCC, DCC, VPC, BLB, or EIP product pages, as this can lead to unintended issues. | |
Application check items
| Type | Item | Suggestions | Reference documentation |
|---|---|---|---|
| Application | Image | When building Docker images, include common debugging tools such as ping, telnet, curl, and vim; customize the set as needed. | |
| Application | Private image | If a container uses a private image, a secret must be configured (see the sketch after this table). | Practice of using private images in CCE K8S clusters |
| Application | Count of instance replicas | For stateless services without conflicts, run more than 2 instance replicas to prevent interruptions from single-point failures and to keep the service available during instance migrations. | |
| Application | Limit range | It is highly recommended that every deployed service configure resources.limits (see the sketch after this table). | Kubernetes limit range |
| Application | Health check | Configure liveness and readiness probes for every launched service to enable automatic failover and ensure reliability (see the sketch after this table). | Kubernetes health check |
| Application | Service exposure mode | Intra-cluster access: ClusterIP Service; extra-cluster access: LoadBalancer Service; extra-cluster access (HTTP/HTTPS): Ingress (see the sketch after this table). | LoadBalancer ingress network traffic; Ingress network traffic |
| Application | Service data persistence | For services requiring data persistence, use the PV and PVC modes. CCE currently supports Cloud File System (CFS), Cloud Disk Service (CDS), and Baidu AI Cloud Object Storage (BOS) via PV/PVC (see the sketch after this table). | Using CFS via PV/PVC mode; Using CDS via PV/PVC mode; Using BOS via PV/PVC mode |
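For the private image item, the sketch below shows how an image pull secret is commonly referenced from a Deployment, assuming the secret has already been created in the cluster (for example with kubectl create secret docker-registry). The secret name my-registry-secret, the workload name, and the image path are placeholders for illustration, not values provided by CCE; see the reference document above for the full procedure.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: private-image-demo          # hypothetical workload name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: private-image-demo
  template:
    metadata:
      labels:
        app: private-image-demo
    spec:
      imagePullSecrets:
      - name: my-registry-secret    # assumed secret, created beforehand for the private registry
      containers:
      - name: app
        image: hub.baidubce.com/your-namespace/your-app:latest   # placeholder private image
```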
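For the limit range and health check items, a minimal sketch of resources.limits together with liveness and readiness probes on a Deployment is shown below. The CPU/memory values, probe path, and port are illustrative assumptions and should be tuned to the actual service.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-checked               # hypothetical workload name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-checked
  template:
    metadata:
      labels:
        app: nginx-checked
    spec:
      containers:
      - name: nginx
        image: hub.baidubce.com/cce/nginx-alpine-go:latest
        resources:
          requests:                 # scheduling baseline
            cpu: 250m
            memory: 256Mi
          limits:                   # hard ceiling; prevents one pod from exhausting the node
            cpu: 500m
            memory: 512Mi
        livenessProbe:              # restarts the container if it stops responding
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:             # removes the pod from Service endpoints until it is ready
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
```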
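For the service exposure modes, the sketch below pairs each access pattern with the corresponding Kubernetes object: a ClusterIP Service for intra-cluster access, a LoadBalancer Service for extra-cluster access (CCE provisions BLB/EIP resources for it), and an Ingress for HTTP/HTTPS access from outside the cluster. The names, host, and ports are placeholders, and the Ingress apiVersion depends on the cluster's Kubernetes version.

```yaml
# Intra-cluster access: ClusterIP Service.
apiVersion: v1
kind: Service
metadata:
  name: nginx-clusterip             # hypothetical name
spec:
  type: ClusterIP
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
---
# Extra-cluster access: LoadBalancer Service; BLB/EIP are bound to it.
apiVersion: v1
kind: Service
metadata:
  name: nginx-lb                    # hypothetical name
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
---
# Extra-cluster HTTP/HTTPS access: Ingress routing to the ClusterIP Service.
# apiVersion varies with the cluster's Kubernetes version.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-ingress               # hypothetical name
spec:
  rules:
  - host: example.yourdomain.com    # placeholder host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx-clusterip
            port:
              number: 80
```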
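For data persistence, the sketch below shows the generic PV/PVC pattern: a PersistentVolumeClaim plus a pod that mounts it. The claim name, storage class, capacity, and mount path are placeholders; take the actual storage classes and parameters for CFS, CDS, or BOS from the reference documents listed in the table.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc                    # hypothetical claim name
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: your-storage-class   # placeholder; use the class for CFS/CDS/BOS from the reference docs
  resources:
    requests:
      storage: 100Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: data-consumer               # hypothetical pod name
spec:
  containers:
  - name: app
    image: hub.baidubce.com/cce/nginx-alpine-go:latest
    volumeMounts:
    - name: data
      mountPath: /data              # placeholder mount path
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-pvc
```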
Common troubleshooting
1. What should I do if the container fails to start?
Generally, you can view the error messages by the following two methods:
- kubectl describe pod podName
- kubectl logs podName
If neither method reveals an obvious error, you can modify the container's start command in the YAML, for example setting it to sleep 3600:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: hub.baidubce.com/cce/nginx-alpine-go:latest
        command: ["/bin/sh", "-c", "sleep 3600"]
```
Once the pod is running, use the command kubectl exec -it podName -- /bin/sh to enter the container and manually run the original start command to check for service error messages.
2. What should I do if creating a LoadBalancer Service fails?
Use the command kubectl describe service serviceName to view events and troubleshoot issues. Typically, the cause is quota limits for EIP or BLB. Submit a ticket to request quota increases if needed.
Note: The number of EIP instances a user can purchase is limited to (current number of existing BCC instances) + (current number of existing BLB instances) + 2.
3. What should I do if container network access fails?
Container network access failures typically take the following forms:
- Service EIP is inaccessible;
- ServiceName is inaccessible within the container;
- Service ClusterIP is inaccessible within the cluster;
- PodIP is inaccessible within the cluster;
- ...
Container network issues often stem from an unreachable PodIP. First, verify whether the PodIP can be reached by pinging it from other nodes and pods. If it cannot, investigate the following:
- View the VPC route table, and confirm whether any routing rules conflict with CCE;
- Check the VPC security group policies to ensure no rules are blocking the requests.
If the issue persists, submit a ticket to contact an administrator for further troubleshooting.
Note: It is normal for a Service ClusterIP not to respond to ping; verify access using ip:port instead. A PodIP, however, should be reachable by ping.
