[ICN-615] NS Lookup Failure when Nodus is a Primary CNI Created: 28/Oct/21  Updated: 06/Nov/21

Status: To Do
Project: Integrated Cloud Native NFV
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: High
Reporter: Palaniappan Ram Assignee: Kuralamudhan Ramakrishnan
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

Nodus is deployed as the primary CNI.

  • When the pod subnet and the ovn-controller-network configmap are both set to 10.244.0.0/16, the DNS lookup fails with the following errors:

$ kubectl exec -it dnsutils -- nslookup kubernetes.default

;; reply from unexpected source: 10.244.0.3#53, expected 10.96.0.10#53

;; reply from unexpected source: 10.244.0.3#53, expected 10.96.0.10#53

;; reply from unexpected source: 10.244.0.3#53, expected 10.96.0.10#53

;; connection timed out; no servers could be reached

  • But when the pod subnet and the ovn-controller-network are both set to 10.158.142.0/18 (the default value configured in ovn-controller-network), DNS works fine:

$ kubectl exec -it dnsutils -- nslookup kubernetes.default

Server:         10.96.0.10

Address:        10.96.0.10#53

 

Name:   kubernetes.default.svc.cluster.local

Address: 10.96.0.1

  • But even in the second case above (where DNS works fine), the emco-monitor errors out and gets restarted continuously on a multi-node cluster, whereas it works fine on a single-node cluster.

Is there any network configuration missing on the host?
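A few checks might help answer that question. The sketch below is a hedged suggestion, not a verified procedure: the kubeadm-config ConfigMap, kube-dns Service, and KUBE-SERVICES chain are standard kubeadm/kube-proxy artifacts, but the namespace of the ovn-controller-network ConfigMap is an assumption and depends on how Nodus was deployed. The "reply from unexpected source" message usually means the DNS reply came straight from a CoreDNS pod IP (10.244.0.3) instead of being un-NATed back to the 10.96.0.10 service IP, which points at the service NAT path rather than CoreDNS itself.

# Confirm the pod CIDR the control plane was initialized with
$ kubectl -n kube-system get cm kubeadm-config -o yaml | grep -i podSubnet

# Confirm the subnet Nodus is using (ConfigMap namespace assumed; adjust to your deployment)
$ kubectl -n kube-system get cm ovn-controller-network -o yaml

# CoreDNS endpoints should be pod IPs inside the pod CIDR, fronted by the 10.96.0.10 service
$ kubectl -n kube-system get svc,ep kube-dns -o wide

# Check that kube-proxy has programmed NAT rules for the DNS service on the failing node
$ sudo iptables -t nat -L KUBE-SERVICES -n | grep 10.96.0.10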

 



 Comments   
Comment by Srinivasa Addepalli [ 06/Nov/21 ]

Just to ensure that there are no stale files causing the issue, is it possible to create fresh VMs and try installing K8s on top of them?

Comment by Palaniappan Ram [ 29/Oct/21 ]

When switching the podCIDR, I removed Kubernetes with

kubeadm reset -f

iptables -F

iptables -X

on all the nodes before reinstallation.

There may still be stale config files in the host which need to be removed.
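One thing worth noting (a hedged suggestion, not verified on this setup): iptables -F and -X only operate on the filter table, so the kube-proxy service rules living in the nat table, plus any CNI/OVN state left on disk, can survive the commands above. A fuller per-node cleanup might look like the sketch below; the paths are the usual defaults and may differ in this deployment.

$ sudo kubeadm reset -f
# flush the nat and mangle tables as well, where kube-proxy programs the service DNAT rules
$ sudo iptables -F && sudo iptables -X
$ sudo iptables -t nat -F && sudo iptables -t nat -X
$ sudo iptables -t mangle -F && sudo iptables -t mangle -X
# only relevant if kube-proxy runs in IPVS mode
$ sudo ipvsadm --clear 2>/dev/null || true
# remove stale CNI and OVS/OVN state (default locations; adjust for this deployment)
$ sudo rm -rf /etc/cni/net.d/* /var/lib/cni/
$ sudo rm -rf /etc/openvswitch /var/lib/openvswitch /etc/ovn /var/lib/ovn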

Comment by Kuralamudhan Ramakrishnan [ 28/Oct/21 ]

We have to do a kubeadm reset and an iptables flush whenever the pod network subnet is changed in the kubeadm control plane, to clean up the setup: iptables has already been programmed with service IPs against the old pod network CIDR range.

Comment by Kuralamudhan Ramakrishnan [ 28/Oct/21 ]

I don't think there is any hardcoding here; we have made these values a configmap. But there could be a case where the OVN resources are still pointing to 10.158.142.0/18 even after changing to 10.244.0.0/16 and restarting the nfn plugins. I delete all the OVN resources, including the nfn plugins, the ovn-daemonset, and the OVN folder, before changing the subnet. I should document these steps and add them to deletion hooks that run when the containers are deleted. But we have to debug Palani's setup to understand more about this issue.
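Until that is documented and automated, a rough ordering for a subnet change, sketched from the comment above, could look like the following. The manifest file names are placeholders for whatever was actually used to deploy Nodus/nfn, and the OVN database paths are the usual defaults; both are assumptions.

# delete the Nodus/nfn and OVN daemonsets before touching the subnet
# (manifest names below are placeholders, not the actual deployment files)
$ kubectl delete -f nfn-operator.yaml -f ovn-daemonset.yaml
# clear OVN's on-disk databases so the old 10.158.142.0/18 subnet is not re-read on restart
$ sudo rm -rf /var/lib/ovn /etc/ovn
# update the ovn-controller-network ConfigMap / pod CIDR to the new value, then redeploy
$ kubectl apply -f ovn-daemonset.yaml -f nfn-operator.yaml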

Comment by Srinivasa Addepalli [ 28/Oct/21 ]

Some hardcoding somewhere?

Comment by Kuralamudhan Ramakrishnan [ 28/Oct/21 ]

I see some issue with the way the subnet changes are happening here. We have to delete the OVN resources before changing the subnet.

A subnet change should not cause DNS issues; some step is missing in this case. Let's debug this today.
