Adding Heapster Metrics to the Dashboard

Posted by ShenHengheng on 2019-01-15


Prerequisites: You have a Kubernetes cluster with the dashboard plugin installed. For instructions on how to do that, see Martin Jensen's article "Kubernetes dashboard on ARM with RBAC".

$ git clone 
$ cd heapster/

Then edit the heapster.yaml and influxdb.yaml files to change the image architecture from -amd64 to -arm. For example, image: k8s.gcr.io/heapster-amd64:v1.4.2 should be changed to image: k8s.gcr.io/heapster-arm:v1.4.2
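
If you would rather not edit the files by hand, the architecture swap can be scripted. The following is only a sketch: the helper name to_arm is made up for illustration, and it assumes every image tag in the manifests follows the name-amd64:version pattern shown above.

```shell
# Rewrite the image architecture suffix in one or more manifests, e.g.
#   image: k8s.gcr.io/heapster-amd64:v1.4.2 -> image: k8s.gcr.io/heapster-arm:v1.4.2
# Only lines containing "image:" are touched; the original is kept as <file>.bak.
# to_arm is a hypothetical helper name, not part of Heapster or kubectl.
to_arm() {
  for manifest in "$@"; do
    sed -i.bak '/image:/ s/-amd64:/-arm:/' "$manifest"
  done
}

# Usage (file names taken from the step above):
#   to_arm heapster.yaml influxdb.yaml
```

Compare each file with its .bak copy before creating anything, so you can confirm that only the image lines changed.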

$ kubectl create -f influxdb.yaml
$ kubectl create -f heapster.yaml

This deploys the Heapster and InfluxDB deployments, services and service accounts.

Now we need to add the roles from heapster/deploy/kube-config/rbac/.

$ cd rbac/
$ kubectl create -f heapster-rbac.yaml

If you go back to the Kubernetes dashboard, you will not see any metrics yet. They are being collected, but they will not appear until the dashboard is restarted.

$ kubectl delete pod -n kube-system kubernetes-dashboard-7fcc5cb979-85vt5

That should take care of it: once the pod is deleted, it is recreated and the dashboard restarts.
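
The pod name above is specific to one cluster, so yours will differ. A sketch that selects the pod by label instead is shown here. It assumes the dashboard pods carry the label k8s-app=kubernetes-dashboard and run in kube-system, which you should check against your own manifest. The restart_dashboard function and the KUBECTL override are illustrative names only.

```shell
# Delete the dashboard pod(s) by label so that the Deployment recreates them.
# ASSUMPTION: pods are labelled k8s-app=kubernetes-dashboard in namespace kube-system.
# Pass a different label selector as $1 if your manifest uses another label.
# Set KUBECTL=echo to print the command instead of running it.
restart_dashboard() {
  ${KUBECTL:-kubectl} delete pod -n kube-system -l "${1:-k8s-app=kubernetes-dashboard}"
}

# Usage:
#   restart_dashboard
```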

Something went wrong!


Bug solution: https://brookbach.com/2018/10/29/Heapster-on-Kubernetes-1.11.3.html

Heapster

Clone

First, clone the Heapster repository.

$ git clone https://github.com/kubernetes/heapster/
$ cd heapster

Then, set the Grafana Service type to NodePort and downgrade the container image from 5.0.4 to 4.4.3 as follows. I downgraded because the dashboard in Grafana is not shown with 5.0.4.

diff --git a/deploy/kube-config/influxdb/grafana.yaml b/deploy/kube-config/influxdb/grafana.yaml
index 216bd9a..266f47a 100644
--- a/deploy/kube-config/influxdb/grafana.yaml
+++ b/deploy/kube-config/influxdb/grafana.yaml
@@ -13,7 +13,7 @@ spec:
     spec:
       containers:
       - name: grafana
-        image: k8s.gcr.io/heapster-grafana-amd64:v5.0.4
+        image: k8s.gcr.io/heapster-grafana-amd64:v4.4.3
         ports:
         - containerPort: 3000
           protocol: TCP
@@ -64,7 +64,7 @@ spec:
   # or through a public IP.
   # type: LoadBalancer
   # You could also use NodePort to expose the service at a randomly-generated port
-  # type: NodePort
+  type: NodePort
   ports:
   - port: 80
     targetPort: 3000
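
With the Service type set to NodePort, Kubernetes picks the external port for you, so you have to look it up after deployment. Here is a small sketch for that lookup. It assumes the service is named monitoring-grafana and lives in kube-system, and grafana_node_port is a made-up helper name.

```shell
# Print the node port assigned to the Grafana service.
# ASSUMPTION: service "monitoring-grafana" in namespace "kube-system".
# Set KUBECTL=echo to print the command instead of running it.
grafana_node_port() {
  ${KUBECTL:-kubectl} get service monitoring-grafana -n kube-system \
    -o jsonpath='{.spec.ports[0].nodePort}'
}

# Usage (once the service has been created):
#   grafana_node_port
```

Grafana should then be reachable on that port on any node's IP address.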

At this point, the pods launch if I run kubectl apply on deploy/kube-config/rbac and deploy/kube-config/influxdb, but Heapster logs the following error and does not work correctly.

E1028 07:39:05.011439       1 manager.go:101] Error in scraping containers from Kubelet:XX.XX.XX.XX:10255: failed to get all container stats from Kubelet URL "http://XX.XX.XX.XX:10255/stats/container/": Post http://XX.XX.XX.XX:10255/stats/container/: dial tcp XX.XX.XX.XX:10255: getsockopt: connection refused
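
On a cluster with many nodes, this error repeats for every kubelet. To list only the distinct URLs that Heapster failed to reach, a filter along these lines may help. It is a sketch that assumes the log format shown above, and failed_kubelet_urls is a name made up for this post.

```shell
# Read Heapster log lines on stdin and print each distinct Kubelet URL
# that appears in an "Error in scraping containers" message.
failed_kubelet_urls() {
  grep 'Error in scraping containers from Kubelet' \
    | sed -n 's/.*Kubelet URL "\([^"]*\)".*/\1/p' \
    | sort -u
}

# Usage (assumes Heapster runs as deployment "heapster" in kube-system):
#   kubectl logs -n kube-system deployment/heapster | failed_kubelet_urls
```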

Apply patch

After googling the error message, I found this issue.

It looks like the port used to reach the kubelet has changed (the patch below uses 10250 instead of 10255) and I need to deal with HTTPS.

So I edited deploy/kube-config/influxdb/heapster.yaml and deploy/kube-config/rbac/heapster-rbac.yaml as below.

diff --git a/deploy/kube-config/influxdb/heapster.yaml b/deploy/kube-config/influxdb/heapster.yaml
index e820ca5..195061a 100644
--- a/deploy/kube-config/influxdb/heapster.yaml
+++ b/deploy/kube-config/influxdb/heapster.yaml
@@ -24,7 +24,7 @@ spec:
         imagePullPolicy: IfNotPresent
         command:
         - /heapster
-        - --source=kubernetes:https://kubernetes.default
+        - --source=kubernetes.summary_api:''?useServiceAccount=true&kubeletHttps=true&kubeletPort=10250&insecure=true
         - --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
 ---
 apiVersion: v1
diff --git a/deploy/kube-config/rbac/heapster-rbac.yaml b/deploy/kube-config/rbac/heapster-rbac.yaml
index 6e63803..1f982fb 100644
--- a/deploy/kube-config/rbac/heapster-rbac.yaml
+++ b/deploy/kube-config/rbac/heapster-rbac.yaml
@@ -5,7 +5,7 @@ metadata:
 roleRef:
   apiGroup: rbac.authorization.k8s.io
   kind: ClusterRole
-  name: system:heapster
+  name: heapster
 subjects:
 - kind: ServiceAccount
   name: heapster

In addition, I created deploy/kube-config/rbac/heapster-role.yaml.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: heapster
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - namespaces
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - extensions
  resources:
  - deployments
  verbs:
  - get
  - list
  - update
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/stats
  verbs:
  - get
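
The last rule, nodes/stats, is the one the summary API source depends on. Once the role and its binding have been applied, you can ask the API server whether the permission is actually in effect. This sketch assumes the service account is heapster in namespace kube-system, and can_read_node_stats is an illustrative name.

```shell
# Ask the API server whether a service account may "get" the nodes/stats subresource.
#   $1 = namespace (default: kube-system), $2 = service account (default: heapster)
# ASSUMPTION: these defaults match your Heapster deployment.
# Set KUBECTL=echo to print the command instead of running it.
can_read_node_stats() {
  ${KUBECTL:-kubectl} auth can-i get nodes --subresource=stats \
    --as="system:serviceaccount:${1:-kube-system}:${2:-heapster}"
}

# Usage:
#   can_read_node_stats
```

kubectl auth can-i prints yes or no. If the answer is no, check that the ClusterRoleBinding's roleRef points at the new heapster role and not at system:heapster.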

Create pods

Finally, I apply these configurations.

$ kubectl create -f ./deploy/kube-config/rbac/
clusterrolebinding.rbac.authorization.k8s.io/heapster created
clusterrole.rbac.authorization.k8s.io/heapster created
$ kubectl create -f ./deploy/kube-config/influxdb/
deployment.extensions/monitoring-grafana created
service/monitoring-grafana created
serviceaccount/heapster created
deployment.extensions/heapster created
service/heapster created
deployment.extensions/monitoring-influxdb created
service/monitoring-influxdb created

After applying the changes, I can confirm that Heapster, Grafana, and InfluxDB are running with kubectl get all -n kube-system. After waiting a few minutes, I can see the metrics with kubectl top node.
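
Since kubectl top node fails until the first metrics arrive, you can poll instead of waiting and retrying by hand. Below is a minimal retry helper as a sketch. The retry function is not a kubectl feature, and the attempt count and delay in the usage line are arbitrary choices.

```shell
# Re-run a command until it succeeds or the attempts are used up.
#   $1 = maximum attempts, $2 = seconds to sleep between attempts, rest = command
# Returns 0 as soon as the command succeeds, 1 if every attempt failed.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while ! "$@"; do
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}

# Usage: try every 10 seconds, up to 30 times
#   retry 30 10 kubectl top node
```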

Thoughts

When I run into misbehavior on Kubernetes, I often find a clue to the solution by reading the logs with kubectl logs.