Prageeth Liyanage
Prageeth Liyanage

Reputation: 1772

Shall I use separate tables for each and every category or one table to store all attributes for a Classified Website?

I am developing a classified website by using asp.net and my DB is mysql. Please MSSQL users I need your support too. this is a problem with database schema not related to a specific database provider.

I just want a little bit clarification from you.

So in here since this is a classified website you can post Job Ads, Vehicle Ads, Real estate ads etc...

So I have a header table to store common details about the ad. like title, description and so on.

CREATE TABLE `ad_header` (
  `ad_header_id_pk` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `district_id_fk` tinyint(5) unsigned NOT NULL,
  `district_name` varchar(50) DEFAULT NULL,
  `city_id_fk` tinyint(5) unsigned DEFAULT NULL,
  `city_name` varchar(50) DEFAULT NULL,
  `category_id_fk` smallint(3) unsigned NOT NULL,
  `sub_category_id_fk` smallint(3) unsigned DEFAULT NULL,
  `title` varchar(100) NOT NULL,
  `description` text NOT NULL,
  ...............
  PRIMARY KEY (`ad_header_id_pk`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

So if It is a Job ad I have another table for to store the attributes that relevant only to a JOb ad like salary, employment type, working hours

Also If it is a vehicle ad I have separate table to store fuel type, transmission type etc...

So I have 10 categories. These categories are not going to change in a decade. So now I have these 2 approaches

1) One header table and 10 specific tables for to store each categories attributes

2) One header table and One attribute table that will hold all attributes of each and every classified groups. for those that are not relevant will hold NULL values

What is the best way to do this regarding performance and scalability.

For those who build classified websites please give me a guide. Thanks in advance

Upvotes: 1

Views: 1083

Answers (1)

Joel Coehoorn
Joel Coehoorn

Reputation: 416081

The question is not entirely clear to me, but I can give some advice:

First of all, if you find yourself wanting to store delimited values in a single column/cell, you need to step back and create a new table to hold that info. NEVER store delimited data in a single column.

If I understand your question correctly, Ads have Categories like "Job", "For Sale", "Vehicle", "Real Estate", etc. Categories should then have Attributes, where attributes might be things unique to each category, like "Transmission Type" or "Mileage" for the Vehicle category, or "Square Feet" or "Year Constructed" for the Real Estate category.

There is more than one correct way to handle this situation.

If the master categories are somewhat fixed, it is a legitimate design choice to have a separate table for the attributes from each category, such that each ad listing would have one record from ad_header, and one record from the specific Attribute table for that category. So a vehicle listing would have an ad_header record and a vehicle_attributes record.

If the categories are more fluid, it is also a legitimate design choice in this case to have one CateogryAttributes table, that defines the Attributes used with each Category, along with an Ad_Listing_Attributes table that holds the attribute data for each listing, that would include a foreign key to both CategoryAttributes and Ad_header. Note the schema for this table effectively follows Entity/Attribute/Value pattern, which is widely considered to actually be more of an anti-pattern. That is, it's something to be avoided in most cases. But if you expect to be frequently adding new categories, it may be the best you can do here.

A final option is to put attributes from all categories in a single large table, and populate only what you need. So a vehichle listing would have only an ad_header record, but there would be a lot of NULL columns in the record. I'd avoid that in this case, because your ideal scenario would want to require some attributes for certain categories (ie: NOT NULLABLE columns) but leave others options.

This is another case where Postgresql may have been the better DB choice. Postgresql has something called table inheritance, that is specifically designed to address this situation to allow you to avoid an EAV table schema.


Full disclosure: I'm actually a Sql Server guy for most situations, but it does seem like Postgresql may be a better fit for you. My experience is MySql was good in the late 90's and early 00's, but has really lagged behind since. It continues to be popular today mainly because of that early momentum along with some advantage in cheap hosting availability, rather than any real technical merit.

Upvotes: 4

Related Questions