News Database Design for a Desktop Application

Question

I am making an application that will process a large amount of news data retrieved by web crawlers. Each item has a News Title, URL, Publication Date, Category and Content. The crawled data is in XML format, and I will load it into my application.

From there, 10 users will process every news item and tag its category manually by reading the title. There are 9 main categories, and for each news item the users will rate from 0 to 5 how strongly it belongs to each category.

Users will also search for news by title and decide whether a news item discusses the same event as another item, or is only similar to it (like an item about a football game at 3 PM and an item about a football game at 5 PM).

I have no problems with making the application itself. I just need help designing one or more tables that can link news items about similar events or about the same event, since many news items can talk about the same event.

So far I've done something like this:

Table News

ID
Title
URL
PublicationDate
NewsContent

Table Category

NewsID
User_ID
Economy
Politics
Present_Day
Sport
Technology
Showbiz
Culture
Region
World

Table User

ID
FirstName
LastName

Each category field in the Category table is a tinyint (I'm using SQL Server), and a check constraint restricts the values the user can input to 0-5. I don't know if this is the right approach so far for the database design. I also need to add one or more tables that record similar news or news that talk about the same event. One idea is a table Similar_News containing fields like News_ID, SimilarNews1_ID, SimilarNews2_ID and so on, which must also record which user 'said this', but this sounds like a flawed design to me.

Any help is appreciated, thank you.

phonetic_man · Accepted Answer

Here are some suggestions. In the CATEGORY table you have created 9 different columns (Economy, Politics etc.). What if a new category appears a few days, months or years down the line? In that case you would have to modify your database design. Instead, you could give your CATEGORY table the following structure.

CategoryId
Category

And have one more table to store the actual news categorization.

Table: NewsCategory
NewsId
CategoryId
CategoryWeight (This will store the rating from 1-5)

If the user feels that a news item does not belong to a particular category, no row is inserted in this table for that category. Such a structure gives you more flexibility to add new categories in the future without changing the database design. You just have to insert new rows in the Category table.
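
To make the two-table layout concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the column types are SQLite's, not tinyint. The UserId column is my assumption (the question says 10 users rate each item, and the column list above does not include it), and the helper name weights_for and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked

conn.executescript("""
    CREATE TABLE Category (
        CategoryId INTEGER PRIMARY KEY,
        Category   TEXT NOT NULL UNIQUE
    );
    CREATE TABLE NewsCategory (
        NewsId         INTEGER NOT NULL,  -- would reference News(ID); News is omitted here
        UserId         INTEGER NOT NULL,  -- assumption: one rating per user
        CategoryId     INTEGER NOT NULL REFERENCES Category(CategoryId),
        CategoryWeight INTEGER NOT NULL CHECK (CategoryWeight BETWEEN 1 AND 5),
        PRIMARY KEY (NewsId, UserId, CategoryId)
    );
""")

# Categories are rows, so adding one later needs no schema change.
conn.executemany("INSERT INTO Category (Category) VALUES (?)",
                 [("Economy",), ("Sport",)])
conn.execute("INSERT INTO Category (Category) VALUES (?)", ("Science",))

sport_id = conn.execute(
    "SELECT CategoryId FROM Category WHERE Category = ?", ("Sport",)
).fetchone()[0]

# Hypothetical rating: user 1 gives news item 100 a weight of 4 for Sport.
conn.execute("INSERT INTO NewsCategory VALUES (?, ?, ?, ?)",
             (100, 1, sport_id, 4))


def weights_for(conn, news_id):
    """Return (category, user id, weight) rows stored for one news item."""
    return conn.execute(
        """
        SELECT c.Category, nc.UserId, nc.CategoryWeight
        FROM NewsCategory AS nc
        JOIN Category AS c ON c.CategoryId = nc.CategoryId
        WHERE nc.NewsId = ?
        ORDER BY c.Category, nc.UserId
        """,
        (news_id,),
    ).fetchall()
```

A rating of 0 is represented by the absence of a row, which is why the check constraint in this sketch starts at 1.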

For storing similar news I would recommend one of the following approaches.

Create a table Event and store the details of the event in this table.

EventId
EventDescription

Sample Data
EventId: 55
EventDescription: Euro 2016 Belgium vs Italy

Now you can include this EventId in your News table. This way you can pull up all the news items that are related to this event.
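
A rough sketch of the Event approach, again using Python's sqlite3 in place of SQL Server. The News columns come from the question, the EventId column is nullable on the assumption that an item has no event until a user links it, and the news titles and the helper news_for_event are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked

conn.executescript("""
    CREATE TABLE Event (
        EventId          INTEGER PRIMARY KEY,
        EventDescription TEXT NOT NULL
    );
    CREATE TABLE News (
        ID              INTEGER PRIMARY KEY,
        Title           TEXT NOT NULL,
        URL             TEXT,
        PublicationDate TEXT,
        NewsContent     TEXT,
        EventId         INTEGER REFERENCES Event(EventId)  -- NULL until linked
    );
""")

conn.execute("INSERT INTO Event VALUES (?, ?)",
             (55, "Euro 2016 Belgium vs Italy"))

# Hypothetical news items: two linked to event 55, one not linked yet.
conn.executemany(
    "INSERT INTO News (ID, Title, EventId) VALUES (?, ?, ?)",
    [
        (1, "Match report (hypothetical title)", 55),
        (2, "Post-match interviews (hypothetical title)", 55),
        (3, "Unrelated item (hypothetical title)", None),
    ],
)


def news_for_event(conn, event_id):
    """Return the titles of all news items linked to one event."""
    rows = conn.execute(
        "SELECT Title FROM News WHERE EventId = ? ORDER BY ID",
        (event_id,),
    ).fetchall()
    return [title for (title,) in rows]
```

Calling news_for_event(conn, 55) returns the two linked titles. Note that with EventId stored directly on News, each item belongs to at most one event and the row does not record which user made the link.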
