Nathan van Jole
Nathan van Jole

Reputation: 513

One multi-dimensional array vs. multiple tables with joined rows

I'm building a lists website (nothing special, people can just create lists for example 'movies to watch', or 'to do' but it's good to learn things).

The current situation: People sign up, I create a row for them in the users table. In that row, I have a column named Lists. In this column I store a serialized array that is structured like this:

$lists = array(
            "To Do" => array(
                                "List Info" => array(
                                                    "add date" => 20170503,
                                                    "add time" => 130530
                                                    ),
                                array(
                                                    "item" => "Go to the supermarket",
                                                    "add date" => 20170503,
                                                    "add time" => 140503,
                                                    "completed" => false
                                                    ),
                                array(
                                                    "item" => "Clean my room",
                                                    "add date" => 20170503,
                                                    "add time" => 180503,
                                                    "completed" => false
                                                    ),
                                array(
                                                    "item" => "Go jogging",
                                                    "add date" => 20170503,
                                                    "add time" => 200503,
                                                    "completed" => true
                                                    )
                            ),
            "Movies To Watch" => array(
                                "List Info" => array(
                                                    "add date" => 20170504,
                                                    "add time" => 130530
                                                    ),
                                array(
                                                    "item" => "Zootopia",
                                                    "add date" => 20170504,
                                                    "add time" => 150503,
                                                    "completed" => true
                                                    ),
                                array(
                                                    "item" => "Anne The Series",
                                                    "add date" => 20170504,
                                                    "add time" => 190503,
                                                    "completed" => false
                                                    ),
                                array(
                                                    "item" => "Upside Down",
                                                    "add date" => 20170505,
                                                    "add time" => 230503,
                                                    "completed" => false
                                                    )
                            )
         );

This works fine, but I was wondering if it's the right way to do it. Especially since I want to store some information about every list and item (add date, add time, and eventually more..)

Would it be better to make multiple tables: users, lists, list items and then inner join them by user ID?

I was also kinda wondering how big social media websites do this... for example Twitter. Do they add a new row for every tweet or do they push a new tweet into the array... or something else?

Any advice would be very appreciated,

Nathan

Upvotes: 0

Views: 172

Answers (2)

Adam Copley
Adam Copley

Reputation: 1495

The recommended approach would be to create the separate tables like you mentioned. (users, lists, list_items)

Your relationships would be like this:

users -> lists = linked by user_id column in both tables

lists -> list_items= linked by list_id column in both tables

This would also give you the ability for users to have multiple lists without the need for an extra column in your users table.

If you were to continue adding the list items into a serialized array then once the array starts to grow to a certain size you will encounter performance issues because you have a huge column entry to sort through.

Databases are pretty performant these days at looking through million even billions of rows in good time. Just remember to have indexes on all of the fields which tables join on.

You could argue the point about that LONGBLOB field allowing 4GB of text, but it will take a hell of a long time to load that single record. Storing them as rows gives you the flexibility of only retrieving the rows you need at the time. Rather than all of them every time you need even just one

To answer your question about twitter, yes they will have a new row for every tweet.

Quote from comments by alexis which I feel is of value:

Google "normalization" and you'll see that the very first step ("first normal form") is to store only one value in each cell. If you can imagine any scenarios where you'll want to query the database for a single setting, give them all their own column. If the contents of the settings are completely transparent to the database, and especially if they are semi-structured (e.g., a varying set of key-value pairs), then it might be justifiable to keep the database ignorant of what the settings are (i.e. to do what you're doing now).

Some useful links on Database Normalization

Upvotes: 2

Juan
Juan

Reputation: 5589

The short answer is that you will be better of if you use different tables. Otherwise you are limiting the number of lists / contents to the size of the field.

Upvotes: 0

Related Questions