BDuelz
BDuelz

Reputation: 3928

Is this a good DB schema for locations

I'm working on an application that is location specific -- think of it as a store locator where store owners enter their address information and other users can only see nearby stores within a certain range. However, it's a little different in the sense that an exact location is not required, only the city/state is required (weird, I know). I have thought about the schema for storing locations, and have decided on this one.

Locations

id                      -- int
formatted_address       -- varchar(200)
is_point_of_interest    -- bool
name                    -- varchar(100) -- NULL
street_number           -- varchar(10)  -- NULL
street                  -- varchar(40)  -- NULL
city                    -- varchar(40)
state                   -- varchar(40)
state_code              -- varchar(3)
postal_code             -- varchar(10)
country                 -- varchar(40)
country_code            -- varchar(3)
latitude                -- float(10,6)
longitude               -- float(10,6)
last_updated_at         -- timestamp

Here are some notes about the application:

I understand that the schema is not very normalized, but I also do not see the need to normalize it any more because locations are very complex, which is why I'm relying on a stable geocode service (google). Also, I plan to allow freeform text input/search, so theres no need for any dropdown lists.

Does anybody see anything wrong or have any improvements, taking into consideration what I've mentioned? I can see this table growing rather large.

Upvotes: 0

Views: 3008

Answers (2)

user166390
user166390

Reputation:

I do not think so. Here is my two-minute synopsis:

This very badly normalized. At least city->country should be moved out to a different table (and normalized from there). I believe postal codes can cross city boundaries though (or I am very badly misremembering); I am not aware of such a city that crosses a state boundary.

formatted_address is an "optimization" and should likely be a computed field: that is, all the data to re-create it should exist elsewhere. (This means that it doesn't need to worried about now.)

Happy designing.


The simple "more-normalized" form just doing the above proposed:

LOCATIONS
location_id             -- int
is_point_of_interest    -- bool
name                    -- varchar(100) -- NULL
street_number           -- varchar(10)  -- NULL
street                  -- varchar(40)  -- NULL
city_id                 -- int
postal_code             -- varchar(10)
latitude                -- float(10,6)
longitude               -- float(10,6)
last_updated_at         -- timestamp

CITIES
city_id                 
name                    -- varchar
-- similarly, the following should be normalized to STATES and COUNTRIES
state                   -- varchar(40)
state_code              -- varchar(3)
country                 -- varchar(40)
country_code            -- varchar(3)

Of course, CITIES can be further normalized, and so could a POSTALS table: I don't know enough about postal codes, or the application domain though. postal_code acts as part of an implicit compound-surrogate-FK so it's not super terrible as it is there. However, moving it into a separate table could easily allow verification and integrity constraints.

Edit: Normalizing a POSTALs table would be best, as only a very samll number of postal codes are valid for a given city: I am not sure the relation between a postal code and a city, though, so I can't recommend how to do this. Perhaps look at existing schemas used?

Upvotes: 1

Eugen Rieck
Eugen Rieck

Reputation: 65274

Since you say "I truly only need the lat/lon" I encourage you to use 2 tables with a 1:1 relationship.

In many (most?) cases a LOT more of lat/lon pairs will be cached, speeding up your workhorse. If you need the additional info, get it when you need it.

Short form: Dont force the DB to move data you don't need through IO and RAM

Additionally, such a schema would keep your doors open for further natural expansion: Linking other info can be done by adding other tables rather than altering existing ones. I consider this a good thing for your SW quality.

Upvotes: 1

Related Questions