Reputation: 105
This was a working setup, and no manual changes were made.
When we try to deploy an application on AKS, it fails to pull an image from ACR.
According to the kubectl describe po output:
Failed to pull image "xyz.azurecr.io/xyz:-beta-68": [rpc error: code = Unknown desc = Error response from daemon: Get https://xyz.azurecr.io/v2/: dial tcp: lookup rxyz.azurecr.io on [::1]:53: read udp [::1]:46256->[::1]:53: read: connection refused, rpc error: code = Unknown desc = Error response from daemon: Get https://xyz.azurecr.io/v2/: dial tcp: lookup xyz.azurecr.io on [::1]:53: read udp [::1]:46112->[::1]:53: read: connection refused, rpc error: code = Unknown desc = Error response from daemon: Get https://xyz.azurecr.io/v2/: dial tcp: lookup xyz.azurecr.io on [::1]:53: read udp [::1]:36677->[::1]:53: read: connection refused]
While troubleshooting, I realised that a few nodes have the DNS entry in /etc/resolv.conf, and image pulls work fine on those nodes; a few other nodes don't have the DNS entry in /etc/resolv.conf, and image pulls fail on those nodes.
If I manually add the DNS entry to /etc/resolv.conf on the nodes that lack it, the change is reverted to the initial state within a few minutes.
Is there a procedure to edit /etc/resolv.conf or otherwise fix the image pull issue?
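For reference, the per-node check described above (does /etc/resolv.conf contain a DNS entry or not) can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are hypothetical, and the sample contents (including the 168.63.129.16 address and the search domain) are made-up examples, not copied from the affected nodes. A file with no nameserver lines would be consistent with the lookups against [::1]:53 in the error, since some resolvers fall back to localhost when no server is configured.

```python
from typing import List


def nameservers(resolv_conf_text: str) -> List[str]:
    """Return the addresses found on 'nameserver' lines, skipping comments."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # resolv.conf treats lines starting with '#' or ';' as comments
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def has_dns_entry(resolv_conf_text: str) -> bool:
    """True if at least one nameserver is configured."""
    return bool(nameservers(resolv_conf_text))


# Illustrative contents only (assumed, not taken from the real nodes)
healthy = "nameserver 168.63.129.16\nsearch example.internal.cloudapp.net\n"
broken = "# nameserver line missing\nsearch example.internal.cloudapp.net\n"

print(has_dns_entry(healthy))
print(has_dns_entry(broken))
```

On a node, the same functions could be fed the real file with open("/etc/resolv.conf").read().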
Upvotes: 9
Views: 2047
Reputation: 1
Restarting the nodes solved the ACR pull problem: https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-manage-cli#restart-vms-in-a-scale-set
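A rough sketch of that restart, assuming the nodes run in a virtual machine scale set and the Azure CLI is installed and logged in. The helper name, the resource group, and the scale set name below are hypothetical placeholders; substitute the values from your own cluster. The helper only builds the az vmss restart command so it can be reviewed before it is run.

```python
import shlex
from typing import List, Optional


def build_restart_command(resource_group: str,
                          vmss_name: str,
                          instance_ids: Optional[List[str]] = None) -> List[str]:
    """Build the argv for 'az vmss restart'; no instance IDs means all instances."""
    cmd = ["az", "vmss", "restart",
           "--resource-group", resource_group,
           "--name", vmss_name]
    if instance_ids:
        # --instance-ids takes a space-separated list of instance IDs
        cmd += ["--instance-ids", *instance_ids]
    return cmd


# Placeholder names (assumptions, not from the original post)
cmd = build_restart_command("MC_myResourceGroup_myCluster_westeurope",
                            "aks-nodepool1-00000000-vmss",
                            ["0", "2"])
print(shlex.join(cmd))
```

To actually execute it, pass the list to subprocess.run(cmd, check=True). Restarting only the affected instances first is a cautious way to confirm that it helps before touching the rest.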
Upvotes: 0
Reputation: 21
There is a bug in Ubuntu that impacts AKS (global). You can follow the link below to see the status: https://status.azure.com/en-us/status In addition, there is a thread here whose suggestions you can follow to work around this issue: https://learn.microsoft.com/en-us/answers/questions/987231/error-connecting-aks-with-acr.html
Upvotes: 2
Reputation: 21
Restart the cluster; that will fix the problem. The Ubuntu team introduced a DNS issue, which is why this problem started.
Upvotes: 1