MrJ
MrJ

Reputation: 1938

More efficient to have more columns or more rows?

I'm currently redesigning a database which could contain a lot of data - I have the option to either include a number of different columns in the database or use a lot of rows instead. It's probably easier if I did some kind of outline below:

item_id | user_id | title | description | content | category | template | comments | status
-------------------------------------------------------------------------------------------
1       | 1       | ABC   | DEF         | GHI     | 1        | default  | 1        | 1
2       | 1       | ZYX   |             | QWE     | 2        | default  | 0        | 1
3       | 1       | A     |             | RTY     | 2        | default  | 0        | 0
4       | 2       | ABC   | DEF         | GHI     | 3        | custom   | 1        | 1
5       | 2       | CBA   |             | GHI     | 3        | custom   | 1        | 1

Versus something in the following structure:

item_id | user_id | attribute   | value
---------------------------------------
1       | 1       | title       | ABC
1       | 1       | description | DEF
1       | 1       | content     | GHI
...     | ...     | ...         | ...

I may want to create additional attributes in the future (50 for arguments sake) - so there could be a lot of empty cells if using multiple columns. The attribute names would be reused, where possible, across different types of content - say a blog entry, event, and gallery - title would easily be reused.

So my question is, is it more efficient to use multiple columns or multiple rows - in terms of query speed and disk space. Or would you instead recommend relationship tables, so there's a table for blogs, a table for events, etc. I'm just trying to come up with an easily expandable solution, where I ideally do not want to create a table for every kind of content as I'm thinking of developers creating new kinds of content via an app/API system (with attributes being tightly controlled).

Supplementary Question if Multiple Rows

How could I, in MySQL, convert multiple rows into a usable column format (I guess temporary tables) - so I could do some filtering by content type, as an example.

Upvotes: 12

Views: 16300

Answers (3)

gimg1
gimg1

Reputation: 1167

This question is very difficult to answer as it all comes down to what you are looking for and how your database will grow in size and complexity over time. I find the best way to answer these types of questions is to read case studies from other successful sites. For example Reddit would be a case study where they use a lot of rows but very little tables and/or columns. The article is here and a question on it is here.

There is also the option of exploring a NoSQL solution which may be more applicable to what you are trying to achieve.

Google case studies of sites that would have a similar structure to your own and see how they accomplished it as they have most likely encountered all the issues you will and already overcome them.

Upvotes: 2

Axel Amthor
Axel Amthor

Reputation: 11096

Basically, mysql has a variable row length as long as one does not change the on a per table level. Thus, empty cols will not use any space (well, almost).

But with blobs or text columns, it might be better to normalize those, as these may have large data to store and this needs to be read / skipped every time a table is scanned. Even if the column is not in the result set and you're doing queries outside of an index, it will take it's time on a large amount of rows.

As a good practice I think it will be fast to put all administrative and often used cols in one table and normalize all the rest. A kind of "vertical" design as in your second example will be complex to read and as soon as you work with temporary tables you will run in to performance issues sooner or later.

Upvotes: 3

Jon G
Jon G

Reputation: 4164

For a traditional row-based store, the cost of spooling through rows will depend on their width, so scanning a table with wide rows will take longer than one with narrow rows.

That said, it you're using an index to locate the rows that are of interest, this won't be so much of an issue.

If you normalise your data by replacing columns with keys to rows in other tables, you can reduce the amount of storage if the linked tables end up being significantly smaller than the original table, however any query will need to include the cost of required joins into the related table.

As with all these things, it's a balancing act that depends on your requirements, but understanding what's going on under the hood can certainly help you to make more informed decisions.

Upvotes: 2

Related Questions