Zerium

Reputation: 17333

What is the most database-efficient way to store and show the people who recently visited another person's page?

I have a blogging program (sorta like Twitter), and I am currently making a recently visited box, which contains 9 people who most recently visited your page.

Everyone is registered with a different username.

What I've currently got is a database like this:

-----------------------------
| id | username | who_visit |
-----------------------------

For example, if 9 users foo1 to foo9 visited foo10's page, the database would be populated with a row:

------------------------------------------------------------------------
| id | username |                       who_visit                      |
------------------------------------------------------------------------
| 1  |  foo10   | foo1, foo2, foo3, foo4, foo5, foo6, foo7, foo8, foo9 |
------------------------------------------------------------------------

And then when foo11 visits foo10's page, I would remove foo9 from the end of the string, and add foo11 to the front.

But the main problem now is, what if foo1 visited foo10's page, then foo2 visited foo10's page, and then foo1 visited foo10's page again? Then I would have to search the 9 users, remove any duplicates, place this one in front, and then continue executing. But the problem is, then it would only show 8 rows.

The only solution to this problem I could think of was making a database like this:

-----------------------------
| id | username | who_visit |
-----------------------------

And instead of populating them in one row, I would add a new row for every visit:

-----------------------------
| id | username | who_visit |
-----------------------------
| 1  |  foo10   |    foo1   |
-----------------------------
| 2  |  foo10   |    foo2   |
-----------------------------
| 3  |  foo10   |    foo3   |
-----------------------------
| 4  |  foo10   |    foo4   |
-----------------------------
| 5  |  foo10   |    foo5   |
-----------------------------
| 6  |  foo10   |    foo6   |
-----------------------------
| 7  |  foo10   |    foo7   |
-----------------------------
| 8  |  foo10   |    foo8   |
-----------------------------
| 9  |  foo10   |    foo9   |
-----------------------------

But then this would take heaps and heaps of unnecessary space.

Is there a method I missed out on, which can efficiently solve this problem without adding > 50000 rows for one user in the database?

Update: For those with the same problem, as PM 77-1 stated below in the comments, one could delete the earliest duplicate row when a new row is inserted. This way, you won't get 'data-bloat'.

Upvotes: 2

Views: 520

Answers (5)

MaxOvrdrv

Reputation: 1916

Use a relationship table instead... creating multiple IDs within your Users table is not recommended for obvious reasons...

for example:

Users Table

[UserID][UserName]

Visits Table

[Source_User_ID][Visitor_User_ID][Visit_Count]

Then your SQL statements become much simpler:

SELECT TOP 9 [Visitor_User_ID] FROM [Visits] WHERE [Source_User_ID]=### ORDER BY [Visit_Count] DESC

Upvotes: 0

Smith Smithy

Reputation: 585

Your second method is the best. When I first started implementing databases in my apps, I tried your first method. It creates problems when you want to expand or change how that data set is handled.

You should have no problem sorting through this data quickly if indexed properly.

You still want to remove the oldest row from the whovisit table. This will prevent your 50k entries. In theory you keep only 9 records in the whovisit table for each user, so your actual table size would be 9 * Number_of_users rows.

table one users

id   |  username
-----|-----------
1    |  foo1
2    |  foo2

table two whovisit

id   |  user(id) | visited(userId) | Date/time stamp
-----|-----------|-----------------|--------------------
1    |  1        |  2              | 9999-12-31 23:59:59

When you insert the new visit, query by the user id and get a row count. If it is 9 or fewer, you're fine; if it is more than 9, delete the oldest rows, leaving a total of 9 rows for the user.
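The insert-then-trim step described above could be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 module; the column names (host_id, visitor_id, visited_at), the integer timestamps, and the record_visit helper are assumptions made for the example, not part of the schema above.

```python
import sqlite3

MAX_VISITORS = 9  # size of the "recently visited" box in the question

def record_visit(conn, host_id, visitor_id, visited_at):
    """Insert one visit, then keep only the newest MAX_VISITORS rows for the host."""
    with conn:  # insert and trim run in a single transaction
        conn.execute(
            "INSERT INTO whovisit (host_id, visitor_id, visited_at) VALUES (?, ?, ?)",
            (host_id, visitor_id, visited_at),
        )
        conn.execute(
            """
            DELETE FROM whovisit
            WHERE host_id = ?
              AND id NOT IN (
                  SELECT id FROM whovisit
                  WHERE host_id = ?
                  ORDER BY visited_at DESC, id DESC
                  LIMIT ?
              )
            """,
            (host_id, host_id, MAX_VISITORS),
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE whovisit ("
    " id INTEGER PRIMARY KEY,"
    " host_id INTEGER NOT NULL,"
    " visitor_id INTEGER NOT NULL,"
    " visited_at INTEGER NOT NULL)"
)
# Index so the per-host lookup and trim do not scan the whole table.
conn.execute("CREATE INDEX idx_whovisit_host ON whovisit (host_id, visited_at)")

# Hypothetical data: users 1..12 visit user 100, one after another.
for t, visitor in enumerate(range(1, 13)):
    record_visit(conn, 100, visitor, t)

recent = [
    row[0]
    for row in conn.execute(
        "SELECT visitor_id FROM whovisit WHERE host_id = ? ORDER BY visited_at DESC",
        (100,),
    )
]
```

Note that this sketch does not collapse repeat visits by the same person; it only caps the number of rows per user.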

Upvotes: 2

michi

Reputation: 6625

Your idea is called normalizing, and it is actually a good idea.

table user:

-----------------
| id | name     |
-----------------
| 1  | foo1     |
-----------------
| 2  | foo2     |
-----------------
| 3  | foo3     |
-----------------

table visit:

-----------------------------
| id | user_id  | visit_id  |
-----------------------------
| 1  |    1     |    2      |
-----------------------------
| 2  |    2     |    3      |
-----------------------------

Now you can easily and quickly store and retrieve visiting data. If you put that into one field (like in your 1st example), you end up in programmer's hell.

You could include a timestamp in table visit and delete entries older than x days.
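A possible sketch of the age-based cleanup, again with Python's sqlite3 module. The visited_at column (stored here as integer epoch seconds), the purge_old_visits helper, and the sample rows are assumptions for illustration only.

```python
import sqlite3

SECONDS_PER_DAY = 86_400

def purge_old_visits(conn, now, max_age_days):
    """Delete visits older than max_age_days and return how many rows were removed."""
    cutoff = now - max_age_days * SECONDS_PER_DAY
    with conn:
        cur = conn.execute("DELETE FROM visit WHERE visited_at < ?", (cutoff,))
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visit ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL,"
    " visit_id INTEGER NOT NULL,"
    " visited_at INTEGER NOT NULL)"  # assumed timestamp column
)

now = 100 * SECONDS_PER_DAY  # arbitrary "current time" for the example
conn.executemany(
    "INSERT INTO visit (user_id, visit_id, visited_at) VALUES (?, ?, ?)",
    [
        (1, 2, now - 40 * SECONDS_PER_DAY),  # older than 30 days
        (1, 3, now - 10 * SECONDS_PER_DAY),
        (2, 3, now - 1 * SECONDS_PER_DAY),
    ],
)

deleted = purge_old_visits(conn, now, max_age_days=30)
remaining = conn.execute("SELECT COUNT(*) FROM visit").fetchone()[0]
```

A cleanup like this would typically run on a schedule rather than on every page view.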

Upvotes: 0

PM 77-1

Reputation: 13334

It seems it will be helpful to add a date/time stamp of each visit.

If you do, the logic could be like this:

  • user is already on the list - update that user's earliest time stamp with the current date/time
  • user is not yet on the list - find the overall earliest visit and update that record with this user's information
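The two cases above might look something like this in practice. It is a sketch with Python's sqlite3 module, not a definitive implementation: the table layout, the record_visit and recent_visitors helpers, and the extra "fewer than 9 rows so far, just insert" branch are assumptions added to make the example complete.

```python
import sqlite3

MAX_VISITORS = 9

def record_visit(conn, host_id, visitor_id, visited_at):
    with conn:
        # Case 1: visitor is already on the list, so refresh the time stamp.
        cur = conn.execute(
            "UPDATE visits SET visited_at = ? WHERE host_id = ? AND visitor_id = ?",
            (visited_at, host_id, visitor_id),
        )
        if cur.rowcount > 0:
            return
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM visits WHERE host_id = ?", (host_id,)
        ).fetchone()
        if count < MAX_VISITORS:
            # Assumed extra case: the list is not full yet, so add a row.
            conn.execute(
                "INSERT INTO visits (host_id, visitor_id, visited_at) VALUES (?, ?, ?)",
                (host_id, visitor_id, visited_at),
            )
        else:
            # Case 2: overwrite the host's oldest row with the new visitor.
            conn.execute(
                """
                UPDATE visits SET visitor_id = ?, visited_at = ?
                WHERE id = (
                    SELECT id FROM visits
                    WHERE host_id = ?
                    ORDER BY visited_at ASC, id ASC
                    LIMIT 1
                )
                """,
                (visitor_id, visited_at, host_id),
            )

def recent_visitors(conn, host_id):
    rows = conn.execute(
        "SELECT visitor_id FROM visits WHERE host_id = ? ORDER BY visited_at DESC",
        (host_id,),
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits ("
    " id INTEGER PRIMARY KEY,"
    " host_id INTEGER NOT NULL,"
    " visitor_id INTEGER NOT NULL,"
    " visited_at INTEGER NOT NULL,"
    " UNIQUE (host_id, visitor_id))"
)

# Users 1..9 visit user 10, then user 1 comes back, then a new user 11 arrives.
for t, visitor in enumerate(range(1, 10)):
    record_visit(conn, 10, visitor, t)
record_visit(conn, 10, 1, 9)
after_repeat = recent_visitors(conn, 10)
record_visit(conn, 10, 11, 10)
after_new = recent_visitors(conn, 10)
```

With this approach the table never holds more than 9 rows per user, and a repeat visitor moves to the front without shrinking the list.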

Upvotes: 2

Hazzit

Reputation: 6882

I would suggest using two tables:

Table users

id | name
1  | foo1
2  | foo2
3  | foo3
4  | foo4   
  ...
10 | foo10

Table visits

host_userid | visitor_userid
    10      |     1
    10      |     2
    10      |     3
    10      |     4

The visits table might also have a date column or a primary key, if necessary. Storing just two integers will result in a very small row size.
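To show how the two tables could be read back for display, here is a small sketch using Python's sqlite3 module. The visited_at column is the optional date column mentioned above, stored as an integer for simplicity; the composite primary key and the recent_visitor_names helper are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE visits (
        host_userid    INTEGER NOT NULL REFERENCES users(id),
        visitor_userid INTEGER NOT NULL REFERENCES users(id),
        visited_at     INTEGER NOT NULL,  -- assumed date column
        PRIMARY KEY (host_userid, visitor_userid)
    );
    """
)

# foo1 .. foo10, matching the sample data above.
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(i, f"foo{i}") for i in range(1, 11)],
)
# foo1..foo4 visit foo10, in that order.
conn.executemany(
    "INSERT INTO visits (host_userid, visitor_userid, visited_at) VALUES (10, ?, ?)",
    [(visitor, t) for t, visitor in enumerate([1, 2, 3, 4])],
)

def recent_visitor_names(conn, host_id, limit=9):
    """Return the names of the most recent visitors, newest first."""
    rows = conn.execute(
        """
        SELECT u.name
        FROM visits AS v
        JOIN users AS u ON u.id = v.visitor_userid
        WHERE v.host_userid = ?
        ORDER BY v.visited_at DESC
        LIMIT ?
        """,
        (host_id, limit),
    ).fetchall()
    return [row[0] for row in rows]

names = recent_visitor_names(conn, 10)
```

The join resolves the integer ids back to usernames only at display time, so the visits rows themselves stay small.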

Upvotes: 0
