Reputation: 970
Pretend I have this schema:
Table: University
String: name
Integer: studentCount
Table: Event
String: name
Foreign Key: university -> University
Table: Attendance
String: name
Foreign Key: event -> Event
If I want to get all the Attendance rows that are associated with a university with more than 1000 student count, what is the best way to optimize database indices?
Can you have an index chase foreign keys?
Like an index in the Attendance table that has combined: (event__university__studentCount)
Upvotes: 0
Views: 57
Reputation: 562951
An index can only reference columns from one table. There's no such thing as an index that spans multiple tables.
There's also no support at least in MySQL for an index on a count or other aggregation. There are other implementations of SQL database that optimize for aggregation and these might index results of aggregate functions. I'm thinking of OLAP column-stores.
With MySQL you can do a query like this:
SELECT Event.*
FROM Event
JOIN Attendance ON Attendance.Event = Event.id
WHERE Event.University = ?
GROUP BY Event.id
HAVING COUNT(*) > 1000;
The foreign key on Attendance.Event
implicitly creates an index, so the join will be optimized in that way. But the COUNT(*)
really needs to examine every row. There's no way around that in an OLTP database like MySQL.
A solution to optimize this would be to store the current attendee count in the Event table. Then it could be indexed and matching events could be found without a join or aggregation.
SELECT * FROM Event
WHERE University = ? AND AttendeeCount > 1000;
This is an example of denormalization.
Upvotes: 1