Reputation: 1491
I have a unique situation. Users of my site can submit articles for other users to review, however they can restrict who can review the articles by age and by country. My issue is that instead of having 250 records (1 for each country linking to the article) I thought it fitting to store all 250 countries (or whichever countries they want to be visible to other users) in JSON format in a text field in the database. This way I would only need one record per article. I'm not sure if the performance will suffer terribly? The site will handle between 1-2 million users and the number of articles submitted for review will be rather large as well. The only "processing" that would be done is each user's country is stored in the database and it would be checked against the country array for an article to see if that user is allowed to review that article.
What do you guys think? Am I over-thinking 250 records for each article is a lot?
Upvotes: 4
Views: 4729
Reputation: 33504
I think that storing the data in a lookup table is perfectly acceptable. It gives you much more freedom in the future if something changes, and as long as you nicely index the tables, performance won't be sufffering too much.
Mysql easily handles data that is billions of records. Yes, you will need to ensure that you look after your data integrity - but adding a column to a lookup table versus changing an object that is stored in every single record suddenly seems much easier.
Just make sure that you are keeping the data properly - as in you aren't repeating information that doesn't have to be repeated. Keep countries in one table, and a simple ID in the lookup table that references it.
Upvotes: 5
Reputation: 52107
JSON will prevent the DBMS from checking the validity of countries that you wish to store. It's basically an opaque text, so the DBMS cannot enforce referential integrity (foreign keys).
And even if you don't need to query on countries (which is a pretty big if), you'll at least need to parse JSON before checking for a particular country.
JSON can be a good match for hierarchical data, but this is just a simple set (a country is either element of the set or not), which can be nicely represented by a separate junction table ARTICLE_COUNTRY, which can then be maintained and searched efficiently:
This junction table would link only to countries in which the article is accessible. If most articles are accessible from most countries, you could even reverse the meaning of the junction table and only store "forbidden" countries, thus lowering the overall number of rows.
Upvotes: 0
Reputation: 63
You can have a separate table for Countries and have their IDs stored with the article table.
You can have Options for All Countries, Asia, Europe, North America, South America etc. stored in your countries tables.
Upvotes: 0
Reputation: 16298
You have a "country" table and a "article" table. I would make a third "country-article" consisting only of the indexes that should match. After all Mysql is relational. If you're worried about performance, benchmark.
Upvotes: 1
Reputation: 1751
I would use another table instead for that data, and make a unique column to match it.
Upvotes: 1
Reputation: 3911
One option might be to use a bitfield to represent your countries, using the MySQL bit or binary type. This would allow you to store the information for each contry in a single bit, which would only end up requiring 32 additional bytes per record(8 bits/byte * 32 bytes = 256 bits).
I'm not certain, but it might even be possible to query using bitwise operations, which could potentially be very fast.
Upvotes: 0
Reputation: 6432
In short I would say that storing Json Data within a column in a Relational database is fine if you are not going to be querying data based on that column.
If you will be needing to look up data based on that column there would be a huge performance hit to having to parse the json prior to excluding data thus this would be a no no.
We ran into this issue at my job on a smaller scale and storing json of the properties in the database has worked well to not increase the complexity of the database for non searched properties.
Upvotes: 3