Reputation: 93216
I've got a list of all countries -> states -> cities (-> subcities/villages etc.) in an XML file, and retrieving, for example, all the cities of a state is really quick with XML (using an XML parser).
I wonder: if I put all this information in MySQL, would retrieving all the cities of a state be as fast as with XML? XML is designed to store hierarchical data, while relational databases like MySQL are not.
The list contains about 500,000 entities, so I wonder whether MySQL would be as fast as XML using either of:
Adjacency list model
Nested Set model
And which one should I use? Theoretically, there could be unlimited levels under a state (I've heard that the adjacency list model isn't good for unlimited child levels). And which is fastest for a dataset this large?
Thanks!
Upvotes: 2
Views: 955
Reputation: 838376
In this article, Quassnoi creates a table with 2,441,405 rows in a hierarchical structure and tests the performance of highly optimized queries for nested sets and adjacency lists. He runs a variety of tests, for example fetching ancestors or descendants, and times the results (read the article for details of exactly what was tested):
Query                                  Nested Sets   Adjacency Lists
All descendants                        300 ms        7000 ms
All ancestors                          15 ms         600 ms
All descendants up to a certain level  5000 ms       600 ms
His conclusion is that, for MySQL, nested sets are faster to query but have the drawback of being much slower to update. If updates are infrequent, use nested sets; otherwise, prefer adjacency lists.
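To make the two models concrete, here is a minimal sketch rather than the article's optimized queries. It uses SQLite through Python's `sqlite3` module as a stand-in for MySQL so it runs without a server. The table name `place`, its columns, and the sample rows are all made up for illustration. One table carries both a `parent_id` (adjacency list) and `lft`/`rgt` bounds (nested set) so the two query styles can be compared side by side.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table holding both representations of one tree."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE place ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER,"      # adjacency list: pointer to the parent row
        " name TEXT NOT NULL,"
        " lft INTEGER NOT NULL,"   # nested set: left bound
        " rgt INTEGER NOT NULL)"   # nested set: right bound
    )
    rows = [  # hypothetical sample data
        (1, None, "USA", 1, 12),
        (2, 1, "California", 2, 9),
        (3, 2, "Los Angeles", 3, 6),
        (4, 3, "Hollywood", 4, 5),
        (5, 2, "San Diego", 7, 8),
        (6, 1, "Texas", 10, 11),
    ]
    conn.executemany("INSERT INTO place VALUES (?, ?, ?, ?, ?)", rows)
    conn.execute("CREATE INDEX place_parent ON place (parent_id)")
    conn.execute("CREATE INDEX place_lft ON place (lft)")
    return conn


def children_adjacency(conn, parent_id):
    """Adjacency list: direct children are a single indexed lookup."""
    cur = conn.execute(
        "SELECT name FROM place WHERE parent_id = ? ORDER BY name",
        (parent_id,),
    )
    return [row[0] for row in cur]


def descendants_nested_set(conn, node_id):
    """Nested set: every descendant lies strictly inside the node's bounds."""
    cur = conn.execute(
        "SELECT d.name FROM place AS a"
        " JOIN place AS d ON d.lft > a.lft AND d.rgt < a.rgt"
        " WHERE a.id = ? ORDER BY d.lft",
        (node_id,),
    )
    return [row[0] for row in cur]


def ancestors_nested_set(conn, node_id):
    """Nested set: every ancestor's bounds strictly enclose the node's."""
    cur = conn.execute(
        "SELECT a.name FROM place AS n"
        " JOIN place AS a ON a.lft < n.lft AND a.rgt > n.rgt"
        " WHERE n.id = ? ORDER BY a.lft",
        (node_id,),
    )
    return [row[0] for row in cur]
```

With nested sets, a subtree of any depth comes back from one range query, which is why reads are cheap. The cost is that inserting a node means renumbering the `lft`/`rgt` values of every row to its right, which is the update penalty mentioned above.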
You might also wish to consider whether using another database that supports recursive CTEs is an option for you.
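As a rough illustration of what a recursive CTE buys you, the sketch below walks an adjacency list to any depth in a single query. It again uses SQLite via Python's `sqlite3` module purely because it runs without a server; the `place` table and its rows are hypothetical, and the exact syntax may differ slightly in other databases.

```python
import sqlite3

# Recursive CTE: start from one node, then repeatedly join in its children.
SUBTREE_SQL = """
WITH RECURSIVE subtree(id, name, depth) AS (
    SELECT id, name, 0 FROM place WHERE id = ?
    UNION ALL
    SELECT p.id, p.name, s.depth + 1
    FROM place AS p
    JOIN subtree AS s ON p.parent_id = s.id
)
SELECT name, depth FROM subtree WHERE depth > 0 ORDER BY depth, name
"""


def make_adjacency_db():
    """Build a small in-memory adjacency-list table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE place ("
        " id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO place VALUES (?, ?, ?)",
        [
            (1, None, "USA"),
            (2, 1, "California"),
            (3, 2, "Los Angeles"),
            (4, 3, "Hollywood"),
            (5, 2, "San Diego"),
            (6, 1, "Texas"),
        ],
    )
    conn.execute("CREATE INDEX place_parent ON place (parent_id)")
    return conn


def descendants_cte(conn, node_id):
    """Return (name, depth) for every descendant of node_id, at any depth."""
    return conn.execute(SUBTREE_SQL, (node_id,)).fetchall()
```

Because the recursion follows `parent_id` until no more children are found, the number of levels does not need to be known in advance, and inserts stay as cheap as in a plain adjacency list.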
I would imagine that an XML file of this size would take a reasonably long time to parse, but if you can cache the parsed structure in memory rather than reading it from disk each time then queries against it will be very fast.
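A minimal sketch of that caching idea, assuming an XML layout with `<country>`, `<state>` and `<city>` elements that each carry a `name` attribute (the real file's structure is not shown in the question, so this layout and the sample data are guesses). The document is parsed once and a dictionary index is kept in memory, so each later lookup is a plain dictionary access.

```python
import xml.etree.ElementTree as ET
from functools import lru_cache

# Hypothetical sample; a real application would call ET.parse(path) instead.
SAMPLE_XML = """
<countries>
  <country name="USA">
    <state name="California">
      <city name="Los Angeles"/>
      <city name="San Diego"/>
    </state>
    <state name="Texas">
      <city name="Austin"/>
    </state>
  </country>
</countries>
"""


def build_state_index(root):
    """Map each state name to the names of all <city> elements beneath it."""
    index = {}
    for state in root.iter("state"):
        index[state.get("name")] = [city.get("name") for city in state.iter("city")]
    return index


@lru_cache(maxsize=1)
def load_state_index():
    """Parse the XML once; later calls return the same cached dictionary."""
    root = ET.fromstring(SAMPLE_XML)
    return build_state_index(root)


def cities_of(state_name):
    """Look up a state's cities without touching the XML again."""
    return load_state_index().get(state_name, [])
```

The parse cost is paid once per process. Whether that is acceptable depends on how the application is deployed: a long-running server amortizes it well, while a script started fresh on every request would pay it each time.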
Note that the main drawback of using MySQL for storing hierarchical data is that it requires some very complex queries. Whilst you can just copy the code from the article I linked to, if you ever need to modify it slightly then you will have to understand how it works. If you prefer to keep things simple, then XML definitely has an advantage, as it was designed for this type of data, so you should easily be able to create the queries you need.
Upvotes: 3