Issue: DSM single-node AWS cluster unhealthy after a power cycle
Affected version: DSM 4.4
Description: After a power cycle of a single-node DSM, the Flannel pod enters a crash-loop state and does not recover.
Flannel pod log error: “Failed to create SubnetManager”
Fix: The cluster usually recovers on its own within 10-15 minutes. If it does not, follow the steps below.
Note: This issue is limited to single-node setups, where a race condition between kubelet, kube-proxy, and Flannel occurs at startup.
- Stop kubelet: systemctl stop kubelet
- Wait 30-45 seconds
- Start kubelet: systemctl start kubelet
- Restart the kube-proxy pod
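
The steps above can be sketched as a small script. This is a minimal sketch, not an official DSM tool. It assumes that kubectl is configured on the node, that the kube-proxy pod runs in the kube-system namespace with the label k8s-app=kube-proxy (the kubeadm default, which may differ on DSM), and that the pod is managed by a DaemonSet so that deleting it causes it to be recreated. The names run, recover_cluster, DRY_RUN, WAIT_SECONDS, KUBE_PROXY_NS, and KUBE_PROXY_LABEL are illustrative only. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of the manual recovery steps for the single-node DSM cluster.
# Prints the commands by default; set DRY_RUN=0 to execute them (as root on the node).
DRY_RUN="${DRY_RUN:-1}"
WAIT_SECONDS="${WAIT_SECONDS:-45}"

# Assumption: kube-proxy lives in kube-system with this label (kubeadm default).
# Verify with: kubectl get pods -A --show-labels
KUBE_PROXY_NS="${KUBE_PROXY_NS:-kube-system}"
KUBE_PROXY_LABEL="${KUBE_PROXY_LABEL:-k8s-app=kube-proxy}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

recover_cluster() {
  run systemctl stop kubelet
  run sleep "$WAIT_SECONDS"
  run systemctl start kubelet
  # Assumption: deleting the pod restarts it because a DaemonSet recreates it.
  run kubectl -n "$KUBE_PROXY_NS" delete pod -l "$KUBE_PROXY_LABEL"
}

recover_cluster
```

Run it once as-is to review the printed commands, then rerun with DRY_RUN=0 to apply them. Afterwards, kubectl get pods -A can be used to check whether the Flannel pod has left the crash-loop state.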