Vicky
Vicky

Reputation: 1615

Query optimization using index

I am still confused with this one. This is from an online problem set.

Assume I have tables USER, CHECKIN and PLACE.

USER(uid, uname, ucity), uid is the primary key.

PLACE (pid, pname, pxcoord, pycoord, pcity), pid is the primary key.

CHECKIN (uid, pid, cdate, ctime), (uid, cdate, ctime) is the primary key.

The query is

select c.uid, c.pid c.cdate
from user u natural join checkin c natural join place p 
where ucity='NewYork' and pcity='Chicago'
  1. I might have some misunderstanding on the definition. But only allowed to create up to two index structures, how should we choose between ordered index and B+ tree index?

  2. For check in table, is there two index (uid, pid) we should create on? If so, will it be any difference which one to be the primary or secondary index?

I copied original questions here

"Suppose for each query, you could create up to two index structures to make the query fast, what index structures would you create, and how would this change the evaluation plans and running time (in other words, fetching a single record with a particular non-key value using these indexes)."

Upvotes: 0

Views: 32

Answers (1)

Rick James
Rick James

Reputation: 142298

"No one" uses NATURAL JOIN. Please change to the JOIN ... ON ... syntax so that you are explicit about how the tables are related. Also, please provide SHOW CREATE TABLE for each table.

"Only 2 index structures"? Where did you get that? InnoDB 'requires' one PRIMARY KEY and up to 64 'secondary keys'. The PRIMARY KEY is stored with the data and orders the data that way. The secondary keys have the primary key in them, so they do a second lookup to get to the data. Both primary and secondary keys are in a BTree structure.

A PRIMARY KEY is, by MySQL definition, UNIQUE and an index. So, give the table a 'natural' PRIMARY KEY based on whatever column (or combination of columns) is unique in the table. Or create a surrogate AUTO_INCREMENT key. Then make whatever other keys you may need for looking up and/or JOINing.

Yes, indexes make query run faster. When you get some tables with millions of rows, we can discuss subtle issues beyond that.

In a SELECT that does not use any indexes, the entire table will be scanned (a "table scan"). This is 'slow', especially if the table is 'large'. But it is OK.

Looking up a single row (a "point query") is fast if it can use an index. It is faster if it can use the PRIMARY KEY, but we are talking about 1 millisecond versus 2 milliseconds for typical situations. Even for a billion row table, we are talking about 10ms vs 20ms. On the the other hand, a table scan of a billion rows might take hours. That's an extreme example of why indexes are important.

In your query, you would want

INDEX(ucity)

If you provide EXPLAIN SELECT ... of your query and tables as they stand, then add that index and run the EXPLAIN again, you will see a difference. We can discuss how things change for the better and how a table scan went away.

See my cookbook for more discussion on how to write optimal indexes.

Another tip: Don't separate DATE and TIME into two separate fields. It is a lot easier to split a DATETIME when needed than it is to put two fields together.

Upvotes: 0

Related Questions