EFK Log Collection System Deployment Guide

Updated at: 2025-10-27

Introduction to the EFK log collection system

EFK is the combination of Elasticsearch, Fluentd, and Kibana. Fluentd collects and aggregates logs on each node and sends them to Elasticsearch for storage, while Kibana provides a front end for log visualization.

  • Elasticsearch: A distributed search and analytics engine that supports full-text search, structured search, and analytics. It is built on Lucene, is among the most popular open-source search engines, and is used by platforms such as Wikipedia, Stack Overflow, and GitHub for their search systems.
  • Fluentd: A free, open-source log collector. It supports more than 125 types of systems and, when combined with other data processing platforms, can serve as the basis for scalable data collection and processing infrastructure or commercial solutions.
  • Kibana: An open-source analytics and visualization platform designed for Elasticsearch. Kibana lets users search, view, and explore data stored in Elasticsearch indexes, and presents analytics through charts, tables, and maps.

Pre-deployment preparation

Before deploying the EFK log collection system in a Kubernetes cluster provided by the CCE service, make sure the following prerequisites are met:

  • You have an initialized Kubernetes cluster on CCE.
  • You can access the cluster via kubectl as described in the [guide document](CCE/Operation guide/Operation process.md).

Create Elasticsearch and Fluentd users

Execute the following commands:

Plain Text
$ kubectl create -f es-rbac.yaml
$ kubectl create -f fluentd-es-rbac.yaml

Note: Confirm your cluster version before using the es-rbac.yaml and fluentd-es-rbac.yaml files, because different cluster versions require different YAML files.

For clusters running Kubernetes 1.6, use the following es-rbac.yaml file:

Plain Text
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: elasticsearch
subjects:
  - kind: ServiceAccount
    name: elasticsearch
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io

For clusters running Kubernetes 1.8, use the following es-rbac.yaml file:

Plain Text
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: elasticsearch
subjects:
  - kind: ServiceAccount
    name: elasticsearch
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io

For clusters running Kubernetes 1.6, use the following fluentd-es-rbac.yaml file:

Plain Text
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: fluentd
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io

For clusters running Kubernetes 1.8, use the following fluentd-es-rbac.yaml file:

Plain Text
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
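
The two variants of each RBAC manifest differ only in the apiVersion of the ClusterRoleBinding, so the choice can be scripted. The following is a minimal sketch rather than anything provided by CCE or kubectl: rbac_api_version is a hypothetical helper, and its rule (v1 from Kubernetes 1.8 onward, v1alpha1 before that) extrapolates from the two versions covered in this guide, so verify it for any other version.

```shell
#!/bin/sh
# Hypothetical helper: print the RBAC apiVersion that matches a
# Kubernetes version string such as "v1.8.6" or "1.6".
rbac_api_version() {
  ver=${1#v}          # drop a leading "v" if present
  major=${ver%%.*}    # text before the first dot
  rest=${ver#*.}      # text after the first dot
  minor=${rest%%.*}   # text before the next dot, if any
  if [ "$major" -gt 1 ] || [ "$minor" -ge 8 ]; then
    echo "rbac.authorization.k8s.io/v1"
  else
    echo "rbac.authorization.k8s.io/v1alpha1"
  fi
}
```

For example, rbac_api_version v1.8.6 prints rbac.authorization.k8s.io/v1, and rbac_api_version 1.6 prints rbac.authorization.k8s.io/v1alpha1. The version string can be read from the VERSION column of kubectl get nodes.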

Deploy Fluentd

The DaemonSet fluentd-es-v1.22 is scheduled only to nodes that carry the label beta.kubernetes.io/fluentd-ds-ready=true. Add this label to every node where you want Fluentd to run:

Plain Text
$ kubectl get nodes
NAME           STATUS    AGE       VERSION
192.168.1.92   Ready     12d        v1.8.6
192.168.1.93   Ready     12d        v1.8.6
192.168.1.94   Ready     12d        v1.8.6
192.168.1.95   Ready     12d        v1.8.6

$ kubectl label nodes 192.168.1.92 192.168.1.93 192.168.1.94 192.168.1.95  beta.kubernetes.io/fluentd-ds-ready=true
node "192.168.1.92" labeled
node "192.168.1.93" labeled
node "192.168.1.94" labeled
node "192.168.1.95" labeled
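
When Fluentd should run on every node, listing node names one by one is not necessary. The sketch below wraps two standard kubectl invocations in helper functions; the function names and the KUBECTL variable are illustrative assumptions, not part of CCE. It assumes that labeling all nodes is acceptable in your cluster.

```shell
#!/bin/sh
# KUBECTL can be overridden, for example to point at a specific binary.
KUBECTL=${KUBECTL:-kubectl}
FLUENTD_LABEL="beta.kubernetes.io/fluentd-ds-ready=true"

# Label every node in the cluster. --overwrite keeps the command from
# failing on nodes that already carry the label.
label_all_nodes() {
  "$KUBECTL" label nodes --all "$FLUENTD_LABEL" --overwrite
}

# List only the nodes that currently carry the label.
list_labeled_nodes() {
  "$KUBECTL" get nodes -l "$FLUENTD_LABEL"
}
```

Running list_labeled_nodes after labeling is a quick way to confirm which nodes the DaemonSet will be scheduled to.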

After adding the label, create the DaemonSet from fluentd-es-ds.yaml to deploy Fluentd. By default, Fluentd is deployed in the kube-system namespace.

Plain Text
$ kubectl create -f fluentd-es-ds.yaml
daemonset "fluentd-es-v1.22" created

$ kubectl get pods -n kube-system -o wide
NAME                        READY     STATUS    RESTARTS   AGE       IP             NODE
fluentd-es-v1.22-07kls      1/1       Running   0          10s       172.18.4.187   192.168.1.94
fluentd-es-v1.22-4np74      1/1       Running   0          10s       172.18.2.162   192.168.1.93
fluentd-es-v1.22-tbh5c      1/1       Running   0          10s       172.18.3.201   192.168.1.95
fluentd-es-v1.22-wlgjb      1/1       Running   0          10s       172.18.1.187   192.168.1.92
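
Instead of reading the pod list by eye, the DaemonSet status can be compared programmatically. This is a minimal sketch under stated assumptions: fluentd_ds_ready and the KUBECTL variable are hypothetical names, while the status fields desiredNumberScheduled and numberReady are standard DaemonSet status fields read through kubectl's jsonpath output.

```shell
#!/bin/sh
KUBECTL=${KUBECTL:-kubectl}

# Succeeds only when at least one fluentd pod is scheduled and every
# scheduled pod reports ready.
fluentd_ds_ready() {
  desired=$("$KUBECTL" get daemonset fluentd-es-v1.22 -n kube-system \
    -o jsonpath='{.status.desiredNumberScheduled}')
  ready=$("$KUBECTL" get daemonset fluentd-es-v1.22 -n kube-system \
    -o jsonpath='{.status.numberReady}')
  echo "fluentd pods ready: ${ready}/${desired}"
  # A desired count of 0 usually means no node carries the label.
  [ "${desired:-0}" -gt 0 ] && [ "$desired" = "$ready" ]
}
```

A desired count of 0 is treated as a failure on purpose, since it typically indicates that no node has the beta.kubernetes.io/fluentd-ds-ready=true label yet.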

The corresponding fluentd-es-ds.yaml file:

Plain Text
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd-es-v1.22
  namespace: kube-system
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    version: v1.22
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-es
        kubernetes.io/cluster-service: "true"
        version: v1.22
      # This annotation ensures that fluentd does not get evicted if the node
      # supports critical pod annotation based priority scheme.
      # Note that this does not guarantee admission on the nodes (#40573).
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: fluentd
      containers:
      - name: fluentd-es
        image: hub.baidubce.com/public/fluentd-elasticsearch:1.22
        command:
          - '/bin/sh'
          - '-c'
          - '/usr/sbin/td-agent 2>&1 >> /var/log/fluentd.log'
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      # Schedule only onto nodes labeled for Fluentd.
      nodeSelector:
        beta.kubernetes.io/fluentd-ds-ready: "true"
      tolerations:
      - key: "node.alpha.kubernetes.io/ismaster"
        effect: "NoSchedule"
      terminationGracePeriodSeconds: 30
      # Host directories that Fluentd reads logs from.
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

Once Fluentd is running, check its log at /var/log/fluentd.log on each node for anomalies. If errors such as “unreadable” appear, confirm that the directories referenced in fluentd-es-ds.yaml are mounted correctly, because Fluentd can only collect logs from these mounted directories. If a log file is a symbolic link, make sure the directory that holds the original (target) file is mounted as well.
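
For the symbolic-link case, it can help to resolve a log path to the directory that actually holds the file and compare that directory with the hostPath volumes in the manifest. The following is a rough sketch to run on a node; real_log_dir is a hypothetical helper name, and it assumes readlink -f from GNU coreutils is available.

```shell
#!/bin/sh
# Hypothetical helper: print the directory that really contains a log file,
# following symbolic links along the way.
real_log_dir() {
    # readlink -f resolves every symlink in the path to its final target.
    target=$(readlink -f "$1") || return 1
    dirname "$target"
}
```

If the printed directory is not covered by one of the mounted host paths (/var/log and /var/lib/docker/containers in the manifest above), Fluentd cannot read the file behind the link.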

Deploy the ElasticSearch service

First, create a Service through which ElasticSearch can be accessed:

Plain Text
$ kubectl create -f es-service.yaml
service "elasticsearch-logging" created

$ kubectl get svc -n kube-system
NAME                    CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
elasticsearch-logging   172.16.215.15   <none>        9200/TCP        1m

Corresponding es-service.yaml file:

Plain Text
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Elasticsearch"
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    k8s-app: elasticsearch-logging

Next, start ElasticSearch by creating its ReplicationController. To verify that it is running correctly, run curl CLUSTER-IP:PORT, using the CLUSTER-IP and port of the elasticsearch-logging service.

Plain Text
$ kubectl create -f es-controller.yaml
replicationcontroller "elasticsearch-logging-v1" created

$ kubectl get pods -n kube-system -o wide
NAME                             READY     STATUS    RESTARTS   AGE       IP             NODE
elasticsearch-logging-v1-0kll0   1/1       Running   0          43s       172.18.2.164   192.168.1.93
elasticsearch-logging-v1-vh17k   1/1       Running   0          43s       172.18.1.189   192.168.1.92

$ curl 172.16.215.15:9200
{
  "name" : "elasticsearch-logging-v1-vh17k",
  "cluster_name" : "kubernetes-logging",
  "cluster_uuid" : "cjvE3LJjTvic8TGCbbKxZg",
  "version" : {
    "number" : "2.4.1",
    "build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16",
    "build_timestamp" : "2016-09-27T18:57:55Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.2"
  },
  "tagline" : "You Know, for Search"
}
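
The CLUSTER-IP and port do not have to be copied by hand; they can be read from the Service object. The following is a minimal sketch, assuming kubectl is configured for the cluster and that the command runs from a machine that can reach cluster IPs (for example, a cluster node). es_endpoint is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print "<cluster-ip>:<port>" for the
# elasticsearch-logging service created from es-service.yaml.
es_endpoint() {
    kubectl get svc elasticsearch-logging -n kube-system \
        -o jsonpath='{.spec.clusterIP}:{.spec.ports[0].port}'
}
```

With this in place, curl "http://$(es_endpoint)" issues the same request as the curl command shown above.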

Corresponding es-controller.yaml file:

Plain Text
apiVersion: v1
kind: ReplicationController
metadata:
  name: elasticsearch-logging-v1
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    version: v1
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 2
  selector:
    k8s-app: elasticsearch-logging
    version: v1
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccountName: elasticsearch
      containers:
      - image: hub.baidubce.com/public/elasticsearch:v2.4.1-1
        name: elasticsearch-logging
        resources:
          # need more cpu upon initialization, therefore burstable class
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - name: es-persistent-storage
          mountPath: /data
        env:
        - name: "NAMESPACE"
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      volumes:
      - name: es-persistent-storage
        emptyDir: {}

Deploy Kibana

Plain Text
$ kubectl create -f kibana-service.yaml
service "kibana-logging" created

$ kubectl create -f kibana-controller.yaml
deployment "kibana-logging" created

$ kubectl get pods -n kube-system -o wide
NAME                              READY     STATUS    RESTARTS   AGE       IP             NODE
kibana-logging-1043852375-wrq6g   1/1       Running   0          48s       172.18.2.175   192.168.1.93

Corresponding kibana-service.yaml file:

Plain Text
apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Kibana"
spec:
  # LoadBalancer provides the EXTERNAL-IP used in the "Access Kibana" step;
  # without a type, the Service would default to ClusterIP.
  type: LoadBalancer
  ports:
  - port: 80
    protocol: TCP
    targetPort: ui
  selector:
    k8s-app: kibana-logging

Corresponding kibana-controller.yaml file:

Plain Text
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: kibana-logging
  template:
    metadata:
      labels:
        k8s-app: kibana-logging
    spec:
      containers:
      - name: kibana-logging
        image: hub.baidubce.com/public/kibana:v4.6.1-1
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
          requests:
            cpu: 100m
        env:
          - name: "ELASTICSEARCH_URL"
            value: "http://elasticsearch-logging:9200"
          - name: "KIBANA_BASE_URL"
            value: ""
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP

When the Kibana pod starts for the first time, it can take up to 10-20 minutes to optimize and cache the status page. You can follow the pod's logs with kubectl logs -f to monitor the progress:

Plain Text
$ kubectl logs kibana-logging-1043852375-wrq6g -n kube-system -f
ELASTICSEARCH_URL=http://elasticsearch-logging:9200
server.basePath: /api/v1/proxy/namespaces/kube-system/services/kibana-logging
{"type":"log","@timestamp":"2017-12-04T09:54:41Z","tags":["info","optimize"],"pid":6,"message":"Optimizing and caching bundles for kibana and statusPage. This may take a few minutes"}
{"type":"log","@timestamp":"2017-12-04T10:02:20Z","tags":["info","optimize"],"pid":6,"message":"Optimization of bundles for kibana and statusPage complete in 458.61 seconds"}
{"type":"log","@timestamp":"2017-12-04T10:02:20Z","tags":["status","plugin:kibana@1.0.0","info"],"pid":6,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
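
Rather than watching the log stream manually, the readiness entry can be detected with a simple text match on the log output. The sketch below assumes the log format shown above, where the final status line contains "state":"green"; kibana_ready is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read Kibana log lines on stdin and succeed once an
# entry reporting the "green" state is present.
kibana_ready() {
    grep -q '"state":"green"'
}
```

For example, kubectl logs kibana-logging-1043852375-wrq6g -n kube-system | kibana_ready && echo "Kibana is ready" prints the message only after optimization has finished.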

Access Kibana

Execute the following command:

Plain Text
$ kubectl get svc -n kube-system

Expected output:

Plain Text
NAME             TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)        AGE
kibana-logging   LoadBalancer   172.16.60.222   180.76.112.7   80:32754/TCP   1m

Access Kibana via LoadBalancer: Open a browser and visit http://180.76.112.7 (this IP is the EXTERNAL-IP of the kibana-logging service).
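
The EXTERNAL-IP can also be read programmatically from the Service status, which is convenient in scripts. The following is a minimal sketch, assuming kubectl is configured for the cluster and that the load balancer reports an IP address rather than a hostname; kibana_url is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the browser URL for the kibana-logging service.
kibana_url() {
    ip=$(kubectl get svc kibana-logging -n kube-system \
        -o jsonpath='{.status.loadBalancer.ingress[0].ip}') || return 1
    # An empty value means no external IP has been assigned (yet).
    [ -n "$ip" ] || return 1
    echo "http://$ip"
}
```

Because the service listens on port 80, the URL needs no explicit port. If the function fails, the load balancer may still be provisioning, or the service may not be of type LoadBalancer.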
