[REC-71] Restore node to service after recovery from Kubelet failure Created: 12/Nov/19 Updated: 12/Nov/19 |
|
| Status: | In Progress |
| Project: | Radio Edge Cloud |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Medium |
| Reporter: | Paul Carver | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Epic Link: | Bug Reports/Fixes |
| Description |
|
After a kubelet failure, a node may not be automatically restored to service even if it recovers successfully and is ready to be used. The HA code needs to be reviewed and updated to handle automatic restoration when appropriate. kubelet_healthcheck.service was not restarted as anticipated when kubelet.service was, and the node was never uncordoned by it. |
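
To make the expected behavior concrete, the following is a minimal sketch of the restore decision the HA code might make. It is not the actual Radio Edge Cloud implementation. `NodeState`, `should_uncordon`, `uncordon_command`, and the `cordoned_by_ha` flag are hypothetical names introduced for illustration; in particular, the sketch assumes the HA code can tell its own cordons apart from ones applied by an operator. Only `kubectl uncordon` is a real command.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeState:
    """Hypothetical snapshot of a node as an HA monitor might see it."""
    name: str
    kubelet_active: bool   # e.g. systemd reports kubelet.service as active
    ready: bool            # the node's Ready condition is True
    unschedulable: bool    # the node is currently cordoned
    cordoned_by_ha: bool   # assumption: the HA code records its own cordons


def should_uncordon(node: NodeState) -> bool:
    """Return True only for a recovered node that the HA code itself cordoned."""
    recovered = node.kubelet_active and node.ready
    # Never undo a cordon that an operator applied deliberately.
    return recovered and node.unschedulable and node.cordoned_by_ha


def uncordon_command(node: NodeState) -> List[str]:
    """Build the kubectl invocation; the caller decides whether to run it."""
    return ["kubectl", "uncordon", node.name]
```

A monitor that keeps running across kubelet restarts could evaluate `should_uncordon` on each health check and run the command from `uncordon_command` when it returns True, so restoration would not depend on kubelet_healthcheck.service being restarted together with kubelet.service.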