[ICN-615] NS Lookup Failure when Nodus is a Primary CNI Created: 28/Oct/21 Updated: 06/Nov/21 |
|
| Status: | To Do |
| Project: | Integrated Cloud Native NFV |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | High |
| Reporter: | Palaniappan Ram | Assignee: | Kuralamudhan Ramakrishnan |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
Nodus is the primary CNI.
$ kubectl exec -it dnsutils -- nslookup kubernetes.default
;; reply from unexpected source: 10.244.0.3#53, expected 10.96.0.10#53
;; reply from unexpected source: 10.244.0.3#53, expected 10.96.0.10#53
;; reply from unexpected source: 10.244.0.3#53, expected 10.96.0.10#53
;; connection timed out; no servers could be reached
A working lookup returns:
$ kubectl exec -it dnsutils -- nslookup kubernetes.default
Server:   10.96.0.10
Address:  10.96.0.10#53

Name:     kubernetes.default.svc.cluster.local
Address:  10.96.0.1
Is there any network configuration missing in the host?
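The symptom (replies coming back from the CoreDNS pod IP 10.244.0.3 instead of the kube-dns service IP 10.96.0.10) usually points at the service NAT/MASQUERADE rules not matching the pod CIDR. A minimal diagnostic sketch, assuming a kubeadm cluster with the standard kube-dns service; the grep patterns are only illustrative:
$ # Confirm the kube-dns service IP and the CoreDNS pod IPs
$ kubectl get svc -n kube-system kube-dns -o wide
$ kubectl get pods -n kube-system -l k8s-app=kube-dns -o wide
$ # Check which pod CIDR the control plane was configured with
$ kubectl cluster-info dump | grep -m1 -- --cluster-cidr
$ # Look for the NAT rules that should cover the kube-dns cluster IP
$ sudo iptables -t nat -S KUBE-SERVICES | grep 10.96.0.10
$ sudo iptables -t nat -S POSTROUTING | grep -i masquerade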
|
| Comments |
| Comment by Srinivasa Addepalli [ 06/Nov/21 ] |
|
Just to ensure that there are no stale files causing the issue, is it possible to create fresh VMs and try installing K8s on top of them? |
| Comment by Palaniappan Ram [ 29/Oct/21 ] |
|
When switching the podCIDR, I removed Kubernetes with kubeadm reset -f, iptables -F, and iptables -X before reinstallation, on all the nodes. There may still be stale config files on the host which need to be removed. |
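For reference, a fuller per-node cleanup sketch; the CNI paths below are the usual defaults and are assumptions about this particular setup:
$ sudo kubeadm reset -f
$ # Flush all rule tables, not just the default filter table
$ sudo iptables -F && sudo iptables -X
$ sudo iptables -t nat -F && sudo iptables -t nat -X
$ sudo iptables -t mangle -F && sudo iptables -t mangle -X
$ # Remove stale CNI config and per-pod network state (default paths, adjust as needed)
$ sudo rm -rf /etc/cni/net.d/* /var/lib/cni/
$ # Restart the runtime and kubelet so nothing keeps the old network state
$ sudo systemctl restart containerd kubelet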
| Comment by Kuralamudhan Ramakrishnan [ 28/Oct/21 ] |
|
We have to do a kubeadm reset and an iptables flush whenever we change the pod network subnet in the kubeadm control plane, to clean up iptables rules where service IPs were already programmed against the old pod network CIDR range. |
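One way to confirm that stale service rules are the cause, sketched with placeholder values: after the subnet change, look for NAT rules that still reference the old pod CIDR.
$ # Rules programmed for the kube-dns cluster IP
$ sudo iptables -t nat -S | grep 10.96.0.10
$ # Any leftovers that still mention the old pod CIDR (substitute the real value)
$ sudo iptables -t nat -S | grep '<old-pod-cidr>'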
| Comment by Kuralamudhan Ramakrishnan [ 28/Oct/21 ] |
|
I don't think there is any hardcoding of these; we have made them configmaps. But there could be a case where the OVN resource is still pointing to 10.158.142.0/18 even after changing it to 10.244.0.0/16 and restarting the nfn plugins. I delete all the OVN resources, including the nfn plugins, the ovn daemonset, and the ovn folder, before changing the subnet. I think I should document these steps and add them as deletion hooks when the containers are deleted. But we have to debug Palani's setup to understand more about this issue. |
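A sketch of that teardown order; the resource names and the state directory below are assumptions about a typical Nodus/ovn4nfv deployment and should be checked against the manifests actually applied:
$ # List the OVN/nfn workloads actually present (namespace may differ)
$ kubectl get daemonsets,deployments -A | grep -Ei 'ovn|nfn'
$ # Delete the nfn plugins and the OVN daemonset before changing the subnet (assumed names)
$ kubectl delete -n kube-system deployment nfn-operator
$ kubectl delete -n kube-system daemonset nfn-agent ovn-controller
$ # Remove the OVN state folder on each node so no stale subnet config survives (assumed path)
$ sudo rm -rf /var/lib/ovn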
| Comment by Srinivasa Addepalli [ 28/Oct/21 ] |
|
Some hardcoding somewhere? |
| Comment by Kuralamudhan Ramakrishnan [ 28/Oct/21 ] |
|
I see some issue with the way the subnet change is being done here. We have to delete the OVN resources before changing the subnet. A subnet change by itself should not cause DNS issues, so some step is missing in this case. Let's debug this today.