EnexoOnoma
EnexoOnoma

Reputation: 8836

Question on how to improve my database structure and what fields should I index?

I am creating a simple comparison script and I have some questions for the database structure. Firstly the database will be huge, I am expecting more than 1 million entries in products. Secondly, there will be a search form that the search term will look into (%$term%) the field name and display the product's related info and shop's info.

Below you can see my database structure named products.

id int(10) NOT NULL
name varchar(50) NOT NULL
link varchar(50) NOT NULL
description varchar(50) NOT NULL
image varchar(50) NOT NULL
price varchar(50) NOT NULL

My questions are:

  1. Do you suggest me to index a field? Users will not be able to insert or update products, the only query will be SELECT to display the results and I will update the products from XML feeds often for possible products changes.

  2. I have to store the shop info like name, shipping, link, image... This gives me two option. a) To create a new table named shops and join those two tables with a new field in products shopID that will look for the id in shops and display the info or b) Should I add these info (name, shipping, ...) in extra fields in products in every single product ? (I think the answer is obvious but I need your suggestion).

  3. Are there any other things I should have in mind, or change?

I am not an advanced programmer and what I learn is through internet, so maybe the questions are too obvious for you, but for me is the ticket for learning.

Thank you for your answers.

Upvotes: 1

Views: 241

Answers (6)

Dor
Dor

Reputation: 7494

  1. Indexes are required to fetch records very fast. So yes, they're recommended. But what kind of an index would you like to use? MyISAM engine offers "regular" string index that you can use with a LIKE clause (e.g. LIKE 'hello%') but it restricts you from using a wildcard at the beginning of the search phrase. In addition, MyISAM has a FULLTEXT index that allows you to search words in the whole string, not just the beginning of the string. So you could create a FULLTEXT index on the columns description and name - but 2 FULLTEXT indexes seem redundant in this case. Maybe you could join those columns and separate the values with a token or a character? If so, you'll need to create only 1 FULLTEXT index on the joined column, which can save a lot fragmentation and disk space. One of the cons for using MyISAM engine is that when writing to it (UPDATE/DELETE queries) - it locks the entire table. So, if the table is written to many times a minute, it will probably make other queries hang. That's why you should see if InnoDB engine suits your needs - which enables concurrent read/write operations on the table.
  2. That's probably a good idea, since having index on the column price seems essential, and FULLTEXT indexes doesn't work together with other indexes.
  3. I'd say: Use InnoDB and Sphinx, and have a primary index on id & a regular index on price.

The most important thing for you to understand is that when writing a code for specific software, you must be well familiar with that software and it's caveats. You should read High performance MySQL - extremely recommended.


Edit: If you want to add an indexes in the products table, you can do that with
ALTER TABLE /* etc */ when the table is empty or contains small amount of data. If the table has a lot of data, then it's recommended to create another table that's similar to products, altering that new table and populating it with data from the old products table, e.g.:

CREATE TABLE `products_new` LIKE `products`;
ALTER TABLE `products_new` ADD FULLTEXT (`name`);
LOCK TABLES `products` READ, `products_new` WRITE;
INSERT INTO `products_new` SELECT * FROM `products`;
LOCK TABLES `products` WRITE, `products_new` WRITE;
ALTER TABLE `products` RENAME TO `products_bad`;
ALTER TABLE `products_new` RENAME TO `products`;
/* The following doesn't work:
      RENAME TABLE `products` TO `products_bad`, `products_new` TO `products`;
   See: http://bugs.mysql.com/bug.php?id=22246
*/
DROP TABLE `products_bad`;

Upvotes: 1

Andreas Ågren
Andreas Ågren

Reputation: 3929

Here are my suggestions:

To be able to search for %term% you need full-text search, an index will not do you any good when the search-term starts with a wildcard.

  1. Yes you should put an index on the id-column (and probably make it auto increment) since that seems to be the unique column in the table. Other than that there's no point in us suggesting any other indexes since we don't which queries you are going to run.
  2. Yes, create another table for shops, otherwise you will have data that is not normalized, for shop-name and so on (there might be rare cases that "require" de-normalization, such as optimization, but you have not reached there yet). Not normalized data will cause problems, in your specific case, such as what will you do when a shop needs to change it name? Well, you will have to update all matching rows in the product table.
  3. There are many things you should keep in mind, but it's out of scope for this answer. I suggest that you get to work and learn as you go, because learning by doing is a great way become a better developer. Then when you hit a specific problem, search for/post it here on stackoverflow.

Upvotes: 0

Jyce
Jyce

Reputation: 1

Generally you should index fields that will be intensively used. But using wildcard for your search won't help much. Better use another table with foreign key. Also shouldn't your "id" field in your products table be define as PRIMARY KEY ?

Upvotes: 0

Brian Hoover
Brian Hoover

Reputation: 7991

Nikolai, The ID should be a primary key. That automatically puts an index on ID, and will speed up any queries that need to get specific products.

The shop table should be a second table, but you should have a 3rd table that joins product with shops. At it's most basic, it would have two fields, shop_id, product_id. This let's you have a single product in multiple shops. These two fields should be foreign keys to the product table and shop table.

If you are ever thinking about having a different price for a product per shop, then the product_store join table should also contain the price, although the base price could be stored in the products table.

Price should be a decimal, so that you can do calculations on the price field.

Upvotes: 1

Nemoden
Nemoden

Reputation: 9056

  1. Price for sure because something tells me you will search over this field and do orderings.
  2. "Premature optimization is a root of all evil" (c) Donald Knuth. So, I suggest to normalize your tables, so YES - create table for shops. Once your applicated grown big, and you faced to highloads, you will be able to denormalize your database to avoid JOINS (one way to optimize your voracious application)
  3. Get back to stackoverflow with your problem ;-)

Upvotes: 0

Jim
Jim

Reputation: 22656

1) You should generally index fields that are commonly used. However since your search on name uses a wildcard at the start an index will have no effect on this query.

2) Creating a shops table and linking to this would be better.

Upvotes: 0

Related Questions