EFK Log Collection System Deployment Guide
Introduction to the EFK Log Collection System
EFK stands for Elasticsearch + Fluentd + Kibana: Fluentd collects logs on each node and forwards them to Elasticsearch, and Kibana serves as the front end for display.
- Elasticsearch is a distributed search and analytics engine that supports full-text search, structured search, and analytics, and can combine all three. Built on Lucene, it is now one of the most widely used open-source search engines; Wikipedia, Stack Overflow, GitHub, and others build their search features on Elasticsearch.
- Fluentd is an excellent open-source, free log collector that currently supports gathering logs from more than 125 kinds of systems. Combined with other data-processing platforms, Fluentd can be used to build large-scale log collection and processing pipelines and commercial-grade solutions.
- Kibana is an open-source analytics and visualization platform designed to work with Elasticsearch. With Kibana you can search, view, and interact with the data stored in Elasticsearch indices, and easily present advanced analyses and visualizations through a variety of charts, tables, and maps.
Preparation Before Deployment
To deploy the EFK log collection system smoothly on a Kubernetes cluster provided by the CCE service, a few prerequisites must be met first:
- You have an initialized Kubernetes cluster on CCE.
- You can access the cluster normally with kubectl, following the relevant guide. (A quick sanity check is shown below.)
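As a minimal sanity check (assuming kubectl has already been configured with the cluster's credentials), listing the nodes confirms both prerequisites at once:
$ kubectl get nodes      # should list the cluster's nodes without an authentication error
$ kubectl cluster-info   # should print the API server address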
Creating the Elasticsearch and Fluentd Service Accounts
Run the following commands:
$ kubectl create -f es-rbac.yaml
$ kubectl create -f fluentd-es-rbac.yaml
Note: Before applying es-rbac.yaml and fluentd-es-rbac.yaml, check the cluster version first; different versions require different yaml files. A quick way to check the version is shown below.
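A minimal way to check the server version (the grep only trims the output; its exact format varies between kubectl releases):
$ kubectl version | grep -i server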
For clusters running version 1.6, the following es-rbac.yaml can be used:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: elasticsearch
subjects:
  - kind: ServiceAccount
    name: elasticsearch
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
For clusters running version 1.8, the following es-rbac.yaml can be used:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: elasticsearch
subjects:
  - kind: ServiceAccount
    name: elasticsearch
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
For clusters running version 1.6, the following fluentd-es-rbac.yaml can be used:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: fluentd
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
For clusters running version 1.8, the following fluentd-es-rbac.yaml can be used:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
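As an optional check (not required by the deployment itself), you can confirm that both service accounts and their bindings now exist:
$ kubectl get serviceaccount elasticsearch fluentd -n kube-system
$ kubectl get clusterrolebinding elasticsearch fluentd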
Deploying Fluentd
The DaemonSet fluentd-es-v1.22 is scheduled only onto Nodes carrying the label beta.kubernetes.io/fluentd-ds-ready=true, so this label must be set on every Node where fluentd is expected to run:
$ kubectl get nodes
NAME           STATUS    AGE       VERSION
192.168.1.92   Ready     12d       v1.8.6
192.168.1.93   Ready     12d       v1.8.6
192.168.1.94   Ready     12d       v1.8.6
192.168.1.95   Ready     12d       v1.8.6

$ kubectl label nodes 192.168.1.92 192.168.1.93 192.168.1.94 192.168.1.95 beta.kubernetes.io/fluentd-ds-ready=true
node "192.168.1.92" labeled
node "192.168.1.93" labeled
node "192.168.1.94" labeled
node "192.168.1.95" labeled
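To verify the label took effect, you can list only the labeled nodes (an optional check):
$ kubectl get nodes -l beta.kubernetes.io/fluentd-ds-ready=true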
After the nodes are labeled, apply the corresponding yaml file to start fluentd; by default it runs in the kube-system namespace.
$ kubectl create -f fluentd-es-ds.yaml
daemonset "fluentd-es-v1.22" created

$ kubectl get pods -n kube-system -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP             NODE
fluentd-es-v1.22-07kls   1/1       Running   0          10s       172.18.4.187   192.168.1.94
fluentd-es-v1.22-4np74   1/1       Running   0          10s       172.18.2.162   192.168.1.93
fluentd-es-v1.22-tbh5c   1/1       Running   0          10s       172.18.3.201   192.168.1.95
fluentd-es-v1.22-wlgjb   1/1       Running   0          10s       172.18.1.187   192.168.1.92
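You can also check the DaemonSet itself and confirm that the desired, current, and ready counts match the number of labeled nodes (an optional check):
$ kubectl get daemonset fluentd-es-v1.22 -n kube-system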
The corresponding fluentd-es-ds.yaml file is as follows:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd-es-v1.22
  namespace: kube-system
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    version: v1.22
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-es
        kubernetes.io/cluster-service: "true"
        version: v1.22
      # This annotation ensures that fluentd does not get evicted if the node
      # supports critical pod annotation based priority scheme.
      # Note that this does not guarantee admission on the nodes (#40573).
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: fluentd
      containers:
      - name: fluentd-es
        image: hub.baidubce.com/public/fluentd-elasticsearch:1.22
        command:
        - '/bin/sh'
        - '-c'
        - '/usr/sbin/td-agent 2>&1 >> /var/log/fluentd.log'
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      nodeSelector:
        beta.kubernetes.io/fluentd-ds-ready: "true"
      tolerations:
      - key: "node.alpha.kubernetes.io/ismaster"
        effect: "NoSchedule"
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
After fluentd starts, check the /var/log/fluentd.log file on each node (the file the container command above writes to) for any errors in fluentd's own log. If you see errors such as "unreadable", verify that all required directories are mounted in fluentd-es-ds.yaml. fluentd collects logs from the mounted directories, and if a log file is only a symlink, the directory holding the original log file must also be mounted. A quick way to inspect the log and resolve symlinks is sketched below.
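For example (a sketch: the pod name fluentd-es-v1.22-07kls comes from the output above, so substitute one of your own pods), you can read fluentd's log through kubectl, and on a node resolve where the kubelet-created symlinks under /var/log/containers actually point (on clusters using the Docker runtime):
$ kubectl exec fluentd-es-v1.22-07kls -n kube-system -- tail -n 50 /var/log/fluentd.log
$ readlink -f /var/log/containers/*.log   # run on the node itself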
Deploying the Elasticsearch Service
First, create the Service used to access elasticsearch:
$ kubectl create -f es-service.yaml
service "elasticsearch-logging" created

$ kubectl get svc -n kube-system
NAME                    CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
elasticsearch-logging   172.16.215.15   <none>        9200/TCP   1m
The corresponding es-service.yaml file is as follows:
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Elasticsearch"
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    k8s-app: elasticsearch-logging
Start the elasticsearch service; you can check whether it started correctly by curling CLUSTER-IP:PORT. (The targetPort: db in the Service refers to the container port named db, i.e. 9200, defined in es-controller.yaml below.)
$ kubectl create -f es-controller.yaml
replicationcontroller "elasticsearch-logging-v1" created

$ kubectl get pods -n kube-system -o wide
NAME                             READY     STATUS    RESTARTS   AGE       IP             NODE
elasticsearch-logging-v1-0kll0   1/1       Running   0          43s       172.18.2.164   192.168.1.93
elasticsearch-logging-v1-vh17k   1/1       Running   0          43s       172.18.1.189   192.168.1.92

$ curl 172.16.215.15:9200
{
  "name" : "elasticsearch-logging-v1-vh17k",
  "cluster_name" : "kubernetes-logging",
  "cluster_uuid" : "cjvE3LJjTvic8TGCbbKxZg",
  "version" : {
    "number" : "2.4.1",
    "build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16",
    "build_timestamp" : "2016-09-27T18:57:55Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.2"
  },
  "tagline" : "You Know, for Search"
}
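Once fluentd has shipped some logs, elasticsearch should contain daily indices. A quick way to list them (assuming the ClusterIP from above; the index names depend on fluentd's output configuration, typically logstash-YYYY.MM.DD):
$ curl 172.16.215.15:9200/_cat/indices?v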
The corresponding es-controller.yaml file is as follows:
apiVersion: v1
kind: ReplicationController
metadata:
  name: elasticsearch-logging-v1
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    version: v1
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 2
  selector:
    k8s-app: elasticsearch-logging
    version: v1
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccountName: elasticsearch
      containers:
      - image: hub.baidubce.com/public/elasticsearch:v2.4.1-1
        name: elasticsearch-logging
        resources:
          # need more cpu upon initialization, therefore burstable class
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - name: es-persistent-storage
          mountPath: /data
        env:
        - name: "NAMESPACE"
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      volumes:
      - name: es-persistent-storage
        emptyDir: {}
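Besides the banner returned by the curl above, the _cluster/health API is a quick way to check replica status (an optional check; with replicas: 2 the expected status is green once both pods have joined the cluster):
$ curl 172.16.215.15:9200/_cluster/health?pretty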
Deploying Kibana
$ kubectl create -f kibana-service.yaml
service "kibana-logging" created

$ kubectl create -f kibana-controller.yaml
deployment "kibana-logging" created

$ kubectl get pods -n kube-system -o wide
NAME                              READY     STATUS    RESTARTS   AGE       IP             NODE
kibana-logging-1043852375-wrq6g   1/1       Running   0          48s       172.18.2.175   192.168.1.93
The corresponding kibana-service.yaml file is as follows:
apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Kibana"
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: ui
  selector:
    k8s-app: kibana-logging
The corresponding kibana-controller.yaml file is as follows:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: kibana-logging
  template:
    metadata:
      labels:
        k8s-app: kibana-logging
    spec:
      containers:
      - name: kibana-logging
        image: hub.baidubce.com/public/kibana:v4.6.1-1
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
          requests:
            cpu: 100m
        env:
        - name: "ELASTICSEARCH_URL"
          value: "http://elasticsearch-logging:9200"
        - name: "KIBANA_BASE_URL"
          value: ""
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP
The first time the kibana Pod starts, it spends a fairly long time (10-20 minutes) optimizing and caching the status page bundles; you can follow (tail -f) the Pod's log to watch the progress:
$ kubectl logs kibana-logging-1043852375-wrq6g -n kube-system -f
ELASTICSEARCH_URL=http://elasticsearch-logging:9200
server.basePath: /api/v1/proxy/namespaces/kube-system/services/kibana-logging
{"type":"log","@timestamp":"2017-12-04T09:54:41Z","tags":["info","optimize"],"pid":6,"message":"Optimizing and caching bundles for kibana and statusPage. This may take a few minutes"}
{"type":"log","@timestamp":"2017-12-04T10:02:20Z","tags":["info","optimize"],"pid":6,"message":"Optimization of bundles for kibana and statusPage complete in 458.61 seconds"}
{"type":"log","@timestamp":"2017-12-04T10:02:20Z","tags":["status","plugin:kibana@1.0.0","info"],"pid":6,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
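The server.basePath printed above suggests that, once the status turns green, kibana can also be reached through the API server proxy (a hedged alternative to the LoadBalancer access described below; the URL layout follows the basePath shown in the log):
$ kubectl proxy &
$ curl http://127.0.0.1:8001/api/v1/proxy/namespaces/kube-system/services/kibana-logging/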
Accessing Kibana
Run the following command:
$ kubectl get svc -n kube-system
The output looks like the following:
NAME             TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)        AGE
kibana-logging   LoadBalancer   172.16.60.222   180.76.112.7   80:32754/TCP   1m
You can access the kibana service through the LoadBalancer: simply open http://180.76.112.7 in a browser. This IP address is the EXTERNAL-IP of the kibana-logging Service. If no external IP is available, see the port-forward sketch below.
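Note that the kibana-service.yaml shown earlier does not set type: LoadBalancer, so whether an EXTERNAL-IP is provisioned depends on how the Service is handled on CCE. A local port-forward is a simple alternative (a sketch, reusing the pod name from the earlier output; substitute your own):
$ kubectl port-forward kibana-logging-1043852375-wrq6g 5601:5601 -n kube-system
# then open http://127.0.0.1:5601 in a browser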
