J.Zil
J.Zil

Reputation: 2449

Geocode lookup in C

I want to do a super fast geocode lookup, returning co-ordinates for an input of Town, City or Country. My knowledge is basic but from what I understand writing it in C is a good start. I was thinking it makes sense to have a tree structure like this:

In my file / database I will have the co-ordinate and the town/city name. If give my program the name "Kent" I want a program that can return me the co-ordinate assoaited with "Kent" in the fastest way possible

Should I store the data in a binary file or a SQL database for performance reasons? What is the best method of searching this data? Perhaps binary tree searching? How should the data be stored? perhaps?

Upvotes: 1

Views: 393

Answers (3)

Thomas Matthews
Thomas Matthews

Reputation: 57718

You should not worry about how the information is stored, except not to duplicate data.

You should create one or more indices for the data. The indicies are associative arrays / maps data structures that contain a key (the item you want to search) and a value (such as the record and other information associated with the key). This will enable you with fast lookups without altering your data for each type of search.

On the other hand, your case is an excellent fit for a data base. I suggest you let the database manager your data (such as efficient lookups). After all, that is what they live for.

See also: At what point is it worth using a database?

Upvotes: 1

High Performance Mark
High Performance Mark

Reputation: 78324

Here's a little advice, but not much more than that:

If you want to find places by name, or name prefix, as you indicate that you wish to, then you would be ill-advised to set up a data structure which stores the data in a hierarchy of country, region, town as you suggest you might. If you have an operation that dominates the use of your data structure you are generally best picking the data structure to suit the operation.

In this case an alphabetical list of places would be more suited to your queries. To each place not at the topmost level you would want to add some kind of reference to the name of its 'parent'. If you have an alphabetical list of places you might also want to consider an index , perhaps one which points directly to the first place in the list which starts with each letter of the alphabet.

As you describe your problem it seems to have much more in common with storing words in a dictionary (I mean the sort of thing in which you look up words rather than any particular collection data-type in any specific programming language which goes under the same name) than with most of what goes under the guise of geo-coding.

My guess would be that a gazetteer including the names of all the world's towns, cities, regions and countries (and their coordinates) which have a population over, say, 1000, could be stored in a very simple data structure (basically a list) with an index or two for rapid location of the first A place-name, the first B, and so on. With a little compression you could probably hold this in the memory of most modern desktop PCs.

Upvotes: 4

Robert H
Robert H

Reputation: 11730

I think the best advice I can give is to use whatever language you are familiar with to get the results you want. Worry about performance once your code works. Then you can look at translating very specific pieces of functionality into C or C++ one at a time until you have the results you want.

Upvotes: 1

Related Questions