Reputation: 77
I need some advice on how to build my DB. I will tell you a general example how it should work.
So the site will have thousands of users - [1 table users with their details i guess - ok here]
Let's say every day there will be some questions/facts posted by admin that users can choose on of the answers,for example 1,2,3. Every user can choose one of the answers on each question.
[1 table with questions i guess -or maybe questions can be separate tables depending the month or year?]
Every user that makes a choise about a specific question for example question/fact #54 will have his answer stored. So a simple thought is to have a new table to store it. For example user1, and question #54 ,answer 1.
But if users are thousands so imagine every day 30-40 questions/facts * thousands answers * days/years etc. That would be too slow I guess.
The other thought is to create 1 table for each user but that would be really bad i think
Imagine that I need to retrieve history of answers and use that data on other modules really fast. Having a table with million or unlimited entries as years pass would be bad to search, right?
Table setup will be few writes, many reads actually.Because reads will be needed all over site.For user with most successfull answers. History of all your answers on your profile. Top users with correct answers per category of question(different categories of questions i forgot to say - so maybe new table for each category? Around 5-10 or similar number of categories) also per month and per year stats.Past years will be just for history purpose on their profile so not so many reads. (so maybe categories have table per year too?)Its all about statistics for every individual user.
So my question is how do you think i should build this?
Thanks in advance
I'm open to more ideas. Also forgot to ask php+mysql or aspx+mssql?
Upvotes: 0
Views: 232
Reputation: 29649
In general, a well-designed database is roughly equally fast when querying a table with millions of records as it is when querying a table with thousands of records - as long as you can access data using an index.
On the other hand, the cost of maintenance goes up very quickly once you get to performance optimization before you need to.
So, I'd recommend designing your database to be easy to understand for developers, to write performance tests, and to only optimize when you really, really need to.
As for your specific question:
Table USERS
user_id (primary key)
name
...
table QUESTIONS
question_id (primary key)
question_date
question_text
table ANSWERS
answer_id (primary key)
question_id (foreign key to questions)
answer_text
table user_answers
user_id (foreign key to users)
answer_id (foreign key to answers)
is_correct_flag
Create indexes on all the keys, and (probably) question_date; if you need to search by username, create an index on that column too.
Now write your data access queries in SQL - don't have to be exactly right, just good enough to let you test. Then use a test data generator to populate your tables - I've used DBMonster in the past. Put twice as much data into the database as you expect to ever need.
Now execute your data access queries, and measure the response time. Do this several times, in different sequences - caching etc. on the database can create misleading results. I've found it useful to encapsulate these using a unit testing framework such as PHPUnit - that way, you can re-run the same tests several times.
If you're lucky, you won't have any performance issues at all. If you're not, use EXPLAIN to optimize the queries. If that doesn't work, consider getting better hardware. If that doesn't work, create pre-computed "reporting tables", which aggregate commonly requested data into a simple, flat structure and are updated either in a batch, or when data changes.
For instance, if you have to report on user scores over a period, you might create a table along the lines of
table USER_SCORE_PERIOD_REPORT
user_id
username
period
score
I like to stick to a naming convention to make sure these "reporting tables" are clearly identified, and not mistaken for regular "transactional" tables.
But really, only do this if you know you have a performance problem - this solution creates more things that can break, with more opportunities for bugs.
Upvotes: 0
Reputation: 20909
What I would consider doing is having a table for the questions and a table for the responses.
Each question would have it's own unique ID that would also appear in the response table.
An example layout for QUESTIONS
would be: QUESTION_ID, TEXT, RESPONSE_VALUES
and an example layout for RESPONSES
would be QUESTION_ID, USER_ID, RESPONSE_ID
.
These kinds of relationships are known as "foreign keys".
You also might want to brush up on "one to many" relationships.
Upvotes: 1