SuperFrog
SuperFrog

Reputation: 7674

Storing detailed data in SQL Server

I’m designing a database in which I save votes. I’ve created a table:

CREATE TABLE [dbo].[users_votes](
    [id] [bigint] NOT NULL,
    [like_votes] [int] NOT NULL DEFAULT ((0)),
    [dislike_votes] [int] NOT NULL DEFAULT ((0)),
    [commented_votes] [int] NOT NULL DEFAULT ((0)),
    [comments_likes] [int] NOT NULL DEFAULT ((0))

The issue is that there is a requirement to also store the breakdown data by location. So for example if user_votes table has 1,000 like_votes for a specific id, I need to know the break down by location, e.g.:

United States 340

France 155

Denmark 25

Brazil 290

Australia 190

I’m getting the data from the client as comma delimited String, for example: (1,2,45,67,87,112,234) and the country code for location (us, au, ca, etc...).

I’ve been thinking about a few possibilities to store this data, but wanted to know which of these approached is best suited (if any).

  1. As the number of country codes is finite, I can expand users_votes table and add columns with country codes for each criteria. E.g. like_votes_us, dislike_votes_us, comment_votes_us, comment_likes_us. In this case I will probably use Dynamic SQL to insert/update the data.

  2. Create new tables for each column, so for example I will have a table named like_votes, in which I will have an id, external_id which will be users_votes (table) id, country_code, and count column. So the data will be stored in users_votes and also in like_votes table. I will have a record for each combination of external_id and country code. In this case I will need to iterate the inserted data in order to determine if this external_id combination exists (and then just increment it) or it needs to be inserted.

Which approach, if any, is the optimal way to store this data so it will be easy to insert/update and also to query?

Upvotes: 0

Views: 74

Answers (1)

Thom A
Thom A

Reputation: 96045

This type of table design you have at the moment isn't a good idea, in all honesty. One big important point of building a good relational database is using Normal Form. i'm not going to explain what that is here, as there are 10's of thousands of articles on the internet explaining it, and its different iterations (from 1NF to 6NF iirc).

Anyway, you can easily do this with a few tables. I having to guess a lot of your set up here, but hopefully you'll be able to extrapolate what you need, and adjust what doesn't.

Firstly, let's start with a client table:

CREATE TABLE dbo.Client (ClientID int IDENTITY(1,1),
                         ClientName varchar(100), --You should really split this into Title, Forename and Surname, I'm just being "lazy" here
                         ClientCountryID int, --Not sure if a Client is related to a country or the vote is, i've guessed the client is.
                         DOB date,
                         EmailAddress varchar(100));
GO

So, we have a simple Client Table now. Next, we want a Country Table. This is very simple:

CREATE TABLE dbo.Country (CountryID int IDENTITY(1,1),
                          CountryName varchar(100),
                          CountryCode char(2)); --For example UK for United Kingdom, FR for France, etc
GO

You might want to store additional content there, but I don't know your set up.

Now, this is where I'm really guessing a lot. I'm assuming that your likes and dislikes, etc, are linked to something. What, I have no idea, so, I'm going to have a table called "Content", however, not knowing what these likes are against, I have no context for this table, thus it's going to be very basic:

CREATE TABLE dbo.Content (ContentID int IDENTITY(1,1),
                          ContentType int, --Guessing might be types, maybe videos, Comments, articles? I have no idea to be honest)
                          ContentParent int, --Comments are joined to a Content (just like here on SO)? I'll guess it's possible
                          Content nvarchar(MAX)); --because I have no idea what's going in there

--Very simple Content Type Table
CREATE TABLE dbo.ContentType (TypeID int IDENTITY(1,1),
                              TypeDescription varchar(100));
GO

Now, finally, we can get onto the votes that you want to store; which might look something like this:

CREATE TABLE dbo.Vote (VoteID int IDENTITY(1,1),
                       ClientID int,
                       ContentID int,
                       Liked bit); --1 for Liked, 0 for Disliked, NULL for N/A perhaps?
GO

Ok, now we have some tables. Now I realise I haven't given any kind of Sample data to go in here, so I'll provide a few INSERTS statements for you, so you can get the idea:

INSERT INTO dbo.Country (CountryName, CountryCode)
VALUES ('United Kingdom','GB'),
       ('France','FR'),
       ('Germany','DE');
GO

INSERT INTO dbo.Client (ClientName, ClientCountryID, DOB, EmailAddress)
VALUES ('Mr John Smith',1, '19880106','[email protected]'),
       ('Ms Penelope Vert',2,'19930509','[email protected]');
GO
INSERT INTO dbo.ContentType (TypeDescription)
VALUES ('Video'),('Article'),('Comment');
GO
INSERT INTO dbo.Content (ContentType, ContentParent, Content)
VALUES (2, NULL, 'This is my first article, hi everyone!'),
       (3, 1, 'Nice! Good to see you''re finally posting!'),
       (1, NULL, 'http://youtube.com');
GO

--And now some votes:
INSERT INTO dbo.Vote (ClientID, ContentID, Liked)
VALUES (1, 1, 1),
       (2, 1, 1),
       (2, 2, 1),
       (2, 3, 0);
GO

Notice how I've put the votes in. I've not aggregated in the table; doing so is an awful idea. instead store each vote individually and use a query to Aggregate. You can easily do this, for example:

SELECT C.ContentID,
       Cy.CountryName,
       COUNT(CASE V.Liked WHEN 1 THEN 1 END) AS LikedVotes,
       COUNT(CASE V.Liked WHEN 0 THEN 1 END) AS DisLikedVotes
FROM dbo.Content C
     JOIN dbo.Vote V ON C.ContentID = V.ContentID
     JOIN dbo.Client CV ON V.ClientID = CV.ClientID
     JOIN dbo.Country Cy ON CV.ClientCountryID = Cy.CountryID
GROUP BY C.ContentID,
         Cy.CountryName;

This gives you the number of Liked Votes per Content Item, and splits it into Countries as well for you. If you want to put these countries into their own columns, then I strongly suggest doing this in your presentation layer, not your SQL (as you'll have to use Dynamic SQL, and (no offence) I imagine this is beyond your skills at the moment based on your current database design choice(s)). Excel is very good at going this using Pivot tables. if you want to retain the process in SQL Server, consider using SSRS and a matrix.

If you have any questions, please do ask.

Note: I have no made any kind of foreign keys, constraints, Default values, etc here. These are a definate must for any good database design.

Clean Up script:

DROP TABLE dbo.Client;
DROP TABLE dbo.Country;
DROP TABLE dbo.Vote;
DROP TABLE dbo.Content;
DROP TABLE dbo.ContentType;
GO

Upvotes: 3

Related Questions