
mysql sql-server database performance architecture

Reputation: 1101

What Are Good Solutions for a Database Table that Gets Too Long?

I will describe a problem using a specific scenario:

Imagine that you create a website to which users can register,
and after they register, they can send Private Messages to each other.

This website enables every user to maintain his own Friends list,
and also maintain a Blocked Users list, from which he prefers not to get messages.

Now the problem:

Imagine this website getting to several millions of users,
and let's also assume that every user has about 10 Friends in the Friends table, and 10 Blocked Users in the Blocked Users table.

The Friends table and the Blocked Users table will become very long,
but worse than that, every time someone wants to send a message to another person "X",
we need to go over the whole Blocked Users table and look for the records that user "X" defined: the people he blocked.

This "scanning" of a long database table, each time a message is sent from one user to another, seems quite inefficient to me.

So I have 2 questions about it:

1. What are possible solutions for this problem?
   I am not afraid of long database tables,
   but I am afraid of database tables that contain data for so many users,
   which means that the whole table needs to be scanned every time, just to pull out a few records from it for that specific user.

2. A specific solution that I have in mind, and that I would like to ask about:
   every user that registers to the website will have his own "mini-database" dynamically (and programmatically) created for him,
   so that the Friends table and the Blocked Users table will contain only records for him.
   This makes scanning those tables very easy, because all the records are for him.
   Does this idea exist in databases like MS SQL Server or MySQL? And if yes, is it a good solution for the described problem?
   (Each user will have his own small database created for him, and of course there is also the main (common) database for all other data that is not user specific.)

Thank you all

Upvotes: 0

Answers (4)

benjamin moskovits

Reputation: 5458

I would hold off on both partitioning and the mini-database idea. Is your database installed with the data, log, and temp files on different RAID drives? Do you have clustered indexes on the tables, and indexes on the search and join columns?

Have you tried reading the query plans to see how and where the slowdowns are occurring? Don't just add memory or try advanced features blindly before doing the basics.

Creating separate databases will become a maintenance nightmare, and it will make the kind of queries you will probably want to run in the future (for all users...) challenging.

Partitioning is a wonderful feature of SQL Server, and while in 2014 you can have thousands of partitions, you probably won't see the big performance bump you are looking for unless you put each partition on a separate drive.

SQL Server has very fast response times, even for tables with tens of millions of rows (in your case, the user table). Don't let the main table get too wide and the response time will be extremely fast.

Upvotes: 2

jean

Reputation: 4350

I did this once for a social network system. Maybe you should look at your normalization. At the time I had a [Relationship] table with just these columns:

UserAId  Int
UserBId  Int
RelationshipFlag  Smallint

With 1 million users, each with 10 "friends", that table had 10 million rows. That was not a problem: we put indexes on the columns, and it could retrieve the list of all users B related to a specific user A in no time.

Take a good look at your schema and your indexes; if they are OK, your DB will not have problems handling it.
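
As a rough illustration of the [Relationship] table described above, here is a minimal sketch in Python. It uses the built-in sqlite3 module as a stand-in for SQL Server or MySQL, which is an assumption made only so the example runs anywhere. The column names come from this answer; the flag encoding, the sample data, and the helper name related_users are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server/MySQL purely for illustration.
rel_conn = sqlite3.connect(":memory:")
rel_conn.execute(
    """
    CREATE TABLE Relationship (
        UserAId          INTEGER NOT NULL,
        UserBId          INTEGER NOT NULL,
        RelationshipFlag INTEGER NOT NULL,  -- hypothetical encoding: 1 = friend
        PRIMARY KEY (UserAId, UserBId)      -- the key doubles as the lookup index
    )
    """
)

# Made-up sample data: 1000 users, each related to the next 10 user ids.
sample_rows = [
    (a, (a + k) % 1000, 1) for a in range(1000) for k in range(1, 11)
]
rel_conn.executemany("INSERT INTO Relationship VALUES (?, ?, ?)", sample_rows)


def related_users(conn, user_a_id, flag):
    """Return the ids of all users B related to user A with the given flag."""
    cursor = conn.execute(
        "SELECT UserBId FROM Relationship "
        "WHERE UserAId = ? AND RelationshipFlag = ? "
        "ORDER BY UserBId",
        (user_a_id, flag),
    )
    return [row[0] for row in cursor]


friends_of_42 = related_users(rel_conn, 42, 1)
print(friends_of_42)
```

Because the key starts with UserAId, the lookup only touches the handful of rows that belong to that user, no matter how many rows other users have added.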

Edit

I agree with @M.Ali

Mini-Database for each user is a definite no-go zone.

IMHO you are fine if you stick with the basics and implement them the right way.

Upvotes: 1

M.Ali

Reputation: 69514

Mini-Database for each user is a definite no-go zone.
On a side note: with a separate table that holds just two columns, UserID and BlockedUserID, both INT and properly indexed, you cannot go wrong, as long as you write your queries sensibly :)
Look into table partitioning; a well-normalized database with decent indexes will also help.
Also, if you can afford an Enterprise licence, table partitioning combined with the table schema described above will give you a very good, query-friendly database schema.
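
To make the two-column table concrete, here is a small hedged sketch in Python with the built-in sqlite3 module (used as a stand-in for SQL Server or MySQL, which is an assumption). The column names UserID and BlockedUserID come from this answer; the table name BlockedUsers, the sample rows, and the helper is_blocked are hypothetical.

```python
import sqlite3

# Assumption: SQLite is only a runnable stand-in for the real database server.
block_conn = sqlite3.connect(":memory:")
block_conn.execute(
    """
    CREATE TABLE BlockedUsers (
        UserID        INTEGER NOT NULL,  -- the user who owns the block list
        BlockedUserID INTEGER NOT NULL,  -- a user they do not want messages from
        PRIMARY KEY (UserID, BlockedUserID)
    )
    """
)

# Made-up sample data: user 1 blocks users 7 and 9, user 2 blocks user 7.
block_conn.executemany(
    "INSERT INTO BlockedUsers VALUES (?, ?)", [(1, 7), (1, 9), (2, 7)]
)


def is_blocked(conn, recipient_id, sender_id):
    """Return True if the recipient has blocked the sender."""
    row = conn.execute(
        "SELECT 1 FROM BlockedUsers "
        "WHERE UserID = ? AND BlockedUserID = ? LIMIT 1",
        (recipient_id, sender_id),
    ).fetchone()
    return row is not None


# Check performed before delivering a message from user 7 to user 1.
print(is_blocked(block_conn, recipient_id=1, sender_id=7))
```

The check names both columns of the key, so it is a single index lookup rather than a pass over every user's block list.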

Upvotes: 1

CShannon

Reputation: 135

Right off the bat my first thought is this:

https://msdn.microsoft.com/en-us/library/ms188730.aspx

Partitioning can allow you to break it up into more manageable pieces and in a way that can be scalable. There will be some choices you have to make about how you break it up, but I believe this is the right path for you.

In regards to table scanning, if you have proper indexing you should be getting seeks in your queries. You will want to look at the execution plans to know for sure, though.

As for having a mini-DB for each user, that is sort of what you can accomplish with partitioning.
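
As a hedged illustration of checking a plan for scans versus seeks, the sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement. This is a stand-in, not SQL Server's execution plan viewer, and the exact wording of the plan text varies between SQLite versions. The table name, index name, and helper query_plan are hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# Assumption: SQLite stands in for the real server; the table has no index yet.
plan_conn = sqlite3.connect(":memory:")
plan_conn.execute(
    "CREATE TABLE BlockedUsers ("
    "UserID INTEGER NOT NULL, BlockedUserID INTEGER NOT NULL)"
)

lookup = "SELECT BlockedUserID FROM BlockedUsers WHERE UserID = ?"

# Without an index, the only option is to read the whole table.
plan_without_index = query_plan(plan_conn, lookup, (42,))

# With an index that starts with UserID, the planner can seek to the rows.
plan_conn.execute(
    "CREATE INDEX ix_blocked_user ON BlockedUsers (UserID, BlockedUserID)"
)
plan_with_index = query_plan(plan_conn, lookup, (42,))

print(plan_without_index)
print(plan_with_index)
```

In SQLite the first plan is reported as a SCAN of the table and the second as a SEARCH using the index, which corresponds to the scan versus seek distinction mentioned above.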

Upvotes: 1
