Reputation: 988
I am writing a web-spider program in C. I am given a list of URLs, and the first step for each URL is to look up the server's IP address with getaddrinfo(). Then something ridiculous happened:
The list contains about 4,000,000 URLs. The first ~6,000 or so are processed just fine, and then suddenly every URL after that fails: getaddrinfo() returns "temporary failure in name resolution" for each one. However, if I restart the program from the first 'bad' URL, it works again.
I have been confused and stuck for two days. DNS itself seems to be working, so it feels as though some limited resource has been used up. Can anyone give me some suggestions?
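For reference, the resolution step boils down to roughly this (simplified; the connect/fetch code and cleanup are left out):

    #include <stdio.h>
    #include <string.h>
    #include <netdb.h>

    /* Called once per URL; "host" is the hostname extracted from the URL. */
    static int resolve(const char *host)
    {
        struct addrinfo hints, *res;

        memset(&hints, 0, sizeof(hints));
        hints.ai_family   = AF_UNSPEC;
        hints.ai_socktype = SOCK_STREAM;

        int rc = getaddrinfo(host, "80", &hints, &res);
        if (rc != 0) {
            /* after ~6,000 URLs this starts printing
               "Temporary failure in name resolution" for every host */
            fprintf(stderr, "%s: %s\n", host, gai_strerror(rc));
            return -1;
        }
        /* ... connect to res->ai_addr and fetch the page ... */
        return 0;
    }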
Upvotes: 1
Views: 8101
Reputation: 104080
I wonder if your ISP has killed your spider on grounds that it is acting very like a worm.
Consider running a local DNS recursor such as PowerDNS Recursor. It caches already-retrieved answers and performs the lookups entirely itself -- it won't rely on an ISP-provided DNS server, so rate limits on your ISP's equipment are less likely to affect your program.
Upvotes: 0
Reputation: 399891
Are you calling freeaddrinfo() on the returned address information? Very basic, but since you're not showing your code, it's the first theory that comes to mind.
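The usual pattern is something like this (just a sketch, since your actual loop isn't shown):

    #include <string.h>
    #include <sys/socket.h>
    #include <netdb.h>

    /* Every successful getaddrinfo() must be paired with freeaddrinfo(),
       otherwise the result lists accumulate on every lookup. */
    static int lookup(const char *host)
    {
        struct addrinfo hints, *res = NULL;

        memset(&hints, 0, sizeof(hints));
        hints.ai_family   = AF_UNSPEC;
        hints.ai_socktype = SOCK_STREAM;

        int rc = getaddrinfo(host, "80", &hints, &res);
        if (rc != 0)
            return rc;                 /* nothing to free on failure */

        /* ... connect using res->ai_addr ... */

        freeaddrinfo(res);             /* release the result list */
        return 0;
    }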
Upvotes: 2
Reputation: 26322
You may be hitting some sort of rate limiting in your DNS server. As with all network problems, run Wireshark: check if the DNS requests which are failing are actually being sent, and if so, what reply they're getting.
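If it does turn out to be rate limiting, one way to cope (a sketch only, not a fix for the underlying cause) is to treat EAI_AGAIN -- the error code behind "temporary failure in name resolution" -- as transient and back off before retrying:

    #include <string.h>
    #include <unistd.h>
    #include <netdb.h>

    /* Retry transient failures (EAI_AGAIN) with a growing delay;
       any other result is returned to the caller immediately. */
    static int resolve_with_backoff(const char *host, struct addrinfo **res)
    {
        struct addrinfo hints;
        unsigned delay = 1;                       /* seconds */

        memset(&hints, 0, sizeof(hints));
        hints.ai_family   = AF_UNSPEC;
        hints.ai_socktype = SOCK_STREAM;

        for (int attempt = 0; attempt < 5; attempt++) {
            int rc = getaddrinfo(host, "80", &hints, res);
            if (rc != EAI_AGAIN)
                return rc;                        /* success or a permanent error */
            sleep(delay);
            if (delay < 32)
                delay *= 2;
        }
        return EAI_AGAIN;                         /* still failing after retries */
    }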
Upvotes: 4