Reputation: 283
I have to make decision how to plan table that will be used to store dates.
I have about 20 different dates for each user and guess 100 000 users right now and growing.
So question is for SELECT query what will work faster if I make table with 20 fields? e.g.
"user_dates"
userId, date_registered, date_paid, date_started_working, ... date_reported, date_fired
20 total fields with 100 000 records in table
or make 2 tables it like first table "date_types" with 3 fields and 20 records for above column names.
id, date_type_id, date_type_name
1 5 date_reported
2 3 date_registerd
...
and second table with 3 fields actual records
"user_dates"
userId, date_type, date
201 2 2012-01-28
202 5 2012-06-14
...
but then with 2 000 000 records ?
I think second option is more universal if I need to add more dates I can do it from front end just adding record to "date_type" table and then using it in "user_dates" however I am now worried about performance with 2 million records in table.
So which option you think will work faster?
Upvotes: 1
Views: 1226
Reputation: 71422
The best way to determine this is through testing. Generally the sizes of data you are talking about (20 date columns by 100K records) is really pretty small in regards to MySQL tables, so I would probably just use one table with multiple columns unless you think you will be adding new types of date fields all the time and desire a more flexible schema. You just need to make sure you index all the fields that will be used in for filtering, ordering, joining, etc. in queries.
The design may also be informed by what type of queries you want to perform against the data. If for example you expect that you might want to query data based on a combination of fields (i.e. user has some certain date, but not another date), the querying will likely be much more optimal on the single table, as you would be able to use a simple SELECT ... WHERE
query. With the separate tables, you might find yourself needing to do subselects, or odd join conditions, or HAVING
clauses to perform the same kind of query.
Upvotes: 1
Reputation: 8099
Usually I go both ways: Put the basic and most oftenly used attributes into one table. Make an additional-attributes table to put rarley used attributes in, which then can be fetched lazily from the application layer. This way you are not doing JOIN's every time you fetch a user.
Upvotes: 0
Reputation: 34063
A longer table will have a larger index. A wider table will have a smaller index but take more psychical space and probably have more overhead. You should carefully examine your schema to see if normalization is complete.
I would, however, go with your second option. This is because you don't need to necessarily have the fields exist if they are empty. So if the user hasn't been fired, no need to create a record for them.
Upvotes: 2
Reputation: 880
If the dates are pretty concrete and the users will have all (or most) of the dates filled in, then I would go with the wide table because it's easier to actually write the queries to get the data. Writing a query that asks for all the users that have date1 in a range and date2 in a range is much more difficult with a vertical table.
I would only go with the longer table if you know you need the option to create date types on the fly.
Upvotes: 1
Reputation: 191819
As long as the user ID and the date-type ID are indexed on the main tables and the user_dates table, I doubt you will notice a problem when querying .. if you were to query the entire table in either case, I'm sure it would take a pretty long time (mostly to send the data, though). A single user lookup will be instantaneous in either case.
Don't sacrifice the relation for some possible efficiency improvement; it's not worth it.
Upvotes: 0