Reputation: 1300
I need to create a list of all DNS queries required to display a large number of sites (ideally up to 1 000 000). The list needs to map each query to the page that triggered it.
Example: Visiting google.com requires DNS queries for google.com, ssl.gstatic.com, apis.google.com and other hosts. My list would read something along the lines of:
google.com:google.com,ssl.gstatic.com,apis.google.com,...
(exact format not relevant here)
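To make the target format concrete, here is a minimal Python sketch that builds one such line from the request URLs logged while loading a page. The function names `hostnames` and `format_entry` and the sample URLs are made up for illustration. It also assumes that the hostnames of the requested URLs are a good enough stand-in for the DNS queries, which ignores caching and CNAME chains.

```python
from urllib.parse import urlsplit


def hostnames(request_urls):
    """Return the distinct hostnames found in request_urls, in first-seen order."""
    seen = []
    for url in request_urls:
        host = urlsplit(url).hostname  # None for data: URIs and similar
        if host and host not in seen:
            seen.append(host)
    return seen


def format_entry(page, request_urls):
    """Build one 'page:host1,host2,...' line, listing the page's own name first."""
    others = [h for h in hostnames(request_urls) if h != page]
    return page + ":" + ",".join([page] + others)


# Hypothetical request log for a single page load.
sample = [
    "https://google.com/",
    "https://ssl.gstatic.com/a.png",
    "https://apis.google.com/js/api.js",
    "https://ssl.gstatic.com/b.png",
]
print(format_entry("google.com", sample))
```

Duplicates are collapsed because a second request to the same host would normally be answered from the resolver cache.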
I currently have two ideas on how to do this: actually visit every site and record the DNS queries each visit triggers, or build a parser that extracts the hostnames of embedded content from each page.
Both ideas have problems, though. Visiting 1 000 000 domains with a 2-second gap between visits (so that queries can be attributed to the visited site afterwards) and about 1 second of load time each (which is pretty optimistic) would take over 34 days, probably longer. A parser, on the other hand, would need a complete list of every form of embedded content that can trigger a DNS query. I would also have to fetch some of the embedded URLs themselves (think iframes), and some content could not be checked for further queries at all (think Flash content that connects to other servers).
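The time estimate can be checked with a few lines of Python. The `workers` parameter is an assumption that goes beyond the original plan: it only applies if several visits can run in parallel while their queries remain attributable to the right page, for example because each browser instance logs its own requests.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def crawl_days(sites, seconds_per_site, workers=1):
    """Estimated wall-clock days to visit `sites` pages one after another per worker."""
    return sites * seconds_per_site / workers / SECONDS_PER_DAY


# 2 s gap + 1 s load time = 3 s per site, strictly sequential.
print(round(crawl_days(1_000_000, 3), 1))              # 34.7
# Shortened list of 100 000 sites, still sequential.
print(round(crawl_days(100_000, 3), 1))                # 3.5
# Full list, assuming 20 independent workers are feasible.
print(round(crawl_days(1_000_000, 3, workers=20), 1))  # 1.7
```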
I'm kind of stuck here and would appreciate some input on how to deal with this. It would be possible to shorten the list of URLs to maybe 100 000, but any fewer would dramatically reduce the usefulness of the result.
For context: I need this list for my bachelor thesis, which deals with an attack strategy on a proposed DNS privacy extension.
Upvotes: 4
Views: 748
Reputation: 125
You can use PhantomJS to do this, as it provides an interface that will let you capture network requests and log them, something along the lines of this example.
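You'd need to write some simple JavaScript. PhantomJS is a standalone headless browser rather than a Node module, but its scripts can be launched from Node (or any other process runner), so it should be fairly easy to run many page loads concurrently and gather the data you need within a reasonable time.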
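As a rough sketch of the "run many page loads concurrently" part, here is one way to structure the driver in Python. The `capture` callable is hypothetical: in practice it would start a headless browser for one page and return the hostnames it requested. Here it is stubbed out so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor


def crawl(pages, capture, workers=8):
    """Call capture(page) for every page on a thread pool and return {page: result}."""
    pages = list(pages)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input pages.
        return dict(zip(pages, pool.map(capture, pages)))


def fake_capture(page):
    """Stand-in for a real headless-browser run (assumption, not a real API)."""
    return [page, "cdn." + page]


result = crawl(["a.example", "b.example"], fake_capture, workers=2)
print(result)
```

Because each call to `capture` only returns what its own browser instance requested, no pause between visits is needed to tell the pages apart. Threads are enough here since the work is waiting on external processes and the network, not CPU.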
Upvotes: 1
Reputation: 11
There is a tool that can do this and produce a graphical representation. It is part of dnssec-tools and is called DNSpktflow (DNS Packet Flow).
It may not do exactly what you want, but it is open source, so you can see how they do it.
Upvotes: 1