The cluster management node (rancher-node-staging-rke2-mgmt1) is showing high load and high memory usage [1].
I'll increase its memory.
Furthermore, the overall cluster has been at the warning memory level for some time.
So I'll also add one more node to the staging cluster to try to ease that part.
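For context, the memory picture can be sanity-checked from the cluster and from the node itself; a quick sketch (assuming metrics-server is available and SSH access to the node, not necessarily what was looked at here):

$ kubectl --context archive-staging-rke2 top nodes            # per-node CPU/memory, needs metrics-server
$ ssh rancher-node-staging-rke2-mgmt1 'uptime && free -h'     # load average and memory directly on the mgmt node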
$ for i in $(seq 1 4); do kubectl --context archive-staging-rke2 drain rancher-node-staging-rke2-worker$i --ignore-daemonsets --delete-local-data --force; done
[2]
$ for i in $(seq 1 4); do kubectl --context archive-staging-rke2 uncordon rancher-node-staging-rke2-worker$i; done
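After the uncordon loop, the obvious verification is that all workers are schedulable again (a sketch, not necessarily what was run here):

$ kubectl --context archive-staging-rke2 get nodes            # every worker should be Ready, with no SchedulingDisabled flag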
I've created the new node worker5 (registered in the inventory too).
It's not registered in the cluster yet for some reason (I'll check why a tad later).
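A first thing to look at on the node side is whether the two agent units are up; a sketch assuming the standard unit names on a Rancher-provisioned RKE2 worker:

root@rancher-node-staging-rke2-worker5:~# systemctl status rancher-system-agent.service rke2-agent.service --no-pager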
There are errors being logged [1] by rancher-system-agent.service.
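The entries below come from the unit's journal on worker5, pulled with something along these lines (the exact flags are an assumption):

root@rancher-node-staging-rke2-worker5:~# journalctl -u rancher-system-agent.service --since "2023-05-19 11:00" --no-pager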
May 19 11:17:42 rancher-node-staging-rke2-worker5 rancher-system-agent[75612]: W0519 11:17:42.856867 75612 reflector.go:443] pkg/mod/github.com/rancher/client-go@v1.24.0-rancher1/tools/cache/reflector.go:168: watch of *v1.Secret ended with: an error on the server ("unable to decode an event from the watch stream: stream error: stream ID 5; INTERNAL_ERROR; received from peer") has prevented the request from succeeding
May 19 11:17:58 rancher-node-staging-rke2-worker5 rancher-system-agent[75612]: W0519 11:17:58.235808 75612 reflector.go:443] pkg/mod/github.com/rancher/client-go@v1.24.0-rancher1/tools/cache/reflector.go:168: watch of *v1.Secret ended with: an error on the server ("unable to decode an event from the watch stream: stream error: stream ID 9; INTERNAL_ERROR; received from peer") has prevented the request from succeeding
May 19 11:18:46 rancher-node-staging-rke2-worker5 rancher-system-agent[75612]: W0519 11:18:46.057552 75612 reflector.go:325] pkg/mod/github.com/rancher/client-go@v1.24.0-rancher1/tools/cache/reflector.go:168: failed to list *v1.Secret: an error on the server ("<html>\r\n<head><title>502 Bad Gateway</title></head>\r\n<body>\r\n<center><h1>502 Bad Gateway</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>") has prevented the request from succeeding (get secrets.meta.k8s.io)
May 19 11:18:46 rancher-node-staging-rke2-worker5 rancher-system-agent[75612]: I0519 11:18:46.057753 75612 trace.go:205] Trace[1616138287]: "Reflector ListAndWatch" name:pkg/mod/github.com/rancher/client-go@v1.24.0-rancher1/tools/cache/reflector.go:168 (19-May-2023 11:18:00.312) (total time: 45745ms):
May 19 11:18:46 rancher-node-staging-rke2-worker5 rancher-system-agent[75612]: Trace[1616138287]: ---"Objects listed" error:an error on the server ("<html>\r\n<head><title>502 Bad Gateway</title></head>\r\n<body>\r\n<center><h1>502 Bad Gateway</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>") has prevented the request from succeeding (get secrets.meta.k8s.io) 45745ms (11:18:46.057)
May 19 11:18:46 rancher-node-staging-rke2-worker5 rancher-system-agent[75612]: Trace[1616138287]: [45.745525005s] [45.745525005s] END
May 19 11:18:46 rancher-node-staging-rke2-worker5 rancher-system-agent[75612]: E0519 11:18:46.057882 75612 reflector.go:139] pkg/mod/github.com/rancher/client-go@v1.24.0-rancher1/tools/cache/reflector.go:168: Failed to watch *v1.Secret: failed to list *v1.Secret: an error on the server ("<html>\r\n<head><title>502 Bad Gateway</title></head>\r\n<body>\r\n<center><h1>502 Bad Gateway</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>") has prevented the request from succeeding (get secrets.meta.k8s.io)
root@rancher-node-staging-rke2-worker5:~# systemctl status rke2-agent.service
[3]
May 19 13:08:29 rancher-node-staging-rke2-worker5 rke2[76720]: time="2023-05-19T13:08:29Z" level=info msg="Waiting to retrieve agent configuration; server is not ready: \"overlayfs\" snapshotter cannot be enabled for \"/var/lib/rancher/rk>
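The "overlayfs snapshotter cannot be enabled" message can indicate that the filesystem backing the RKE2 data directory doesn't support overlayfs; two quick checks on the node (assuming the default data dir under /var/lib/rancher/rke2, which the truncated log line doesn't confirm):

root@rancher-node-staging-rke2-worker5:~# grep overlay /proc/filesystems      # is overlayfs supported by the kernel
root@rancher-node-staging-rke2-worker5:~# stat -f -c %T /var/lib/rancher      # filesystem type backing the data dir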