Reputation: 805
Let's suppose the database is big. I have a very complex query for a search results page. In the query below, you can see that I retrieve some attribute value ids from the user_profile table; education is one such attribute. When I have the value id for the education attribute, I look up its label in a PHP array where the id is the array key:
public static $education = array(0 => 'No answer',
1 => 'High school',
2 => 'Some college',
3 => 'In college',
4 => 'College graduate',
5 => 'Grad / professional school',
6 => 'Post grad');
It is similar with about 10 other attributes. Otherwise my query would be even more complex: I would need to create an attribute_id_label table and add another join per attribute to retrieve the label for each attribute's value id. That means 10 extra joins, which could slow the query. Still, this would be the correct way.
So my question is: if the attribute_id_label table has only about 500 records, will 10 joins against it make any big difference, since the table is very small? Even if the user_profile table is very big and the query is already quite complex, as you can see?
And here is my query:
SELECT
group_concat(DISTINCT looking.looking_for SEPARATOR ',') as lookingFor,
group_concat(DISTINCT photo.photo ORDER BY photo.photo_id DESC SEPARATOR ',') as photos,
profile.user_id as userId,
url as profileUrl,
nickname,
avatar.photo,
city,
ethnicity,
education,
occupation,
income
-- and 10 more fields like education, occupation, ethnicity...
FROM user_profile profile
LEFT JOIN user_profile_photo photo ON photo.user_id=profile.user_id
LEFT JOIN user_profile_photo avatar ON avatar.photo_id=profile.photo_id
INNER JOIN user_profile_looking_for looking ON looking.user_id=profile.user_id
LEFT JOIN user_profile_txt txt ON txt.user_id = profile.user_id
INNER JOIN place a ON a.place_id=profile.place_id
INNER JOIN (SELECT lat, lon FROM place WHERE place_id = :place_id) b ON (3959 * acos( cos( radians(b.lat) ) * cos( radians( a.lat ) ) * cos( radians( a.lon ) - radians(b.lon) ) + sin( radians(b.lat) ) * sin( radians( a.lat ) ) ) ) < :within
GROUP BY profile.user_id LIMIT 0,12
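To make the question concrete, the 10 extra label joins would look roughly like this (just a sketch; I'm assuming the attribute columns in user_profile hold option ids that match attribute_id_label.option_id):

```sql
-- Sketch: one join against the small attribute_id_label table per attribute.
-- LEFT JOIN so profiles with unanswered attributes still appear.
SELECT
    profile.user_id,
    edu.label AS education,
    eth.label AS ethnicity,
    occ.label AS occupation
    -- ...one more join per remaining attribute
FROM user_profile profile
LEFT JOIN attribute_id_label edu ON edu.option_id = profile.education
LEFT JOIN attribute_id_label eth ON eth.option_id = profile.ethnicity
LEFT JOIN attribute_id_label occ ON occ.option_id = profile.occupation;
```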
Most attributes won't be filled in by the user, and since you advise non-NULLable columns, what would be best to use for those unfilled attributes? I could give each attribute an extra 'No answer' value. Take the attributes education and want, for example: education has attr_id 1, want has attr_id 2.
eav_attribute_option
option_id | attr_id | label
        1 |       1 | No answer
        2 |       1 | High school
        3 |       1 | ...
        4 |       2 | No answer
        5 |       2 | Opportunities
        6 |       2 | ...
But now the 'No answer' value is repeated for each attribute. That repetition is the price of avoiding NULL values, and I am not sure this is correct.
Upvotes: 0
Views: 1438
Reputation: 108706
I have done a lot of this kind of codelist work. It typically helps performance more than it hurts. @alxklx pointed out the truth: you must make sure your codelist tables (e.g. education) are well formed. That is, each should have a primary key of an index-friendly integer type such as int, rather than a decimal or varchar. If you do these things, your JOINs can look as simple as this:
FROM people p
JOIN education e ON p.education_id = e.education_id
and the RDBMS's optimizer knows they're straightforward 1:1 joins.
All that being said, any complex query needs to be examined both for functionality and performance before you put it into a live system.
If you have missing data in your people table, use an education_id (or some other attribute_id) of either zero or one. Put a row in each codelist table with id zero or one and a value of "unknown" or "user didn't tell us" or whatever makes sense. (You can choose zero or one based on what is convenient for your application. I prefer zero, but that's just personal preference.)
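A sketch of that scheme (table and column names are examples, not your actual schema):

```sql
-- Codelist table with a 0 = 'Unknown' row, so the FK can be NOT NULL.
CREATE TABLE education (
    education_id INT NOT NULL PRIMARY KEY,
    label        VARCHAR(64) NOT NULL
);

INSERT INTO education (education_id, label) VALUES
    (0, 'Unknown'),
    (1, 'High school'),
    (2, 'College graduate');

-- The referencing column defaults to 0 instead of allowing NULL.
CREATE TABLE people (
    people_id    INT NOT NULL PRIMARY KEY,
    education_id INT NOT NULL DEFAULT 0
);
```

With this in place, the simple JOIN above always matches a row, and no NULL handling is needed in the select list.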
Upvotes: 1
Reputation: 29629
In general - very, very general - when you join on a foreign key relationship, i.e. where attribute_id is indeed a primary key with a corresponding index and an index-friendly data type like INT, you can treat the join as effectively free from a performance point of view.
Best way to find out is to try it and ask EXPLAIN to tell you what's going on.
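For example (a sketch; people and education are placeholder names, not your schema):

```sql
-- Prefix the query with EXPLAIN and read the join plan.
EXPLAIN SELECT p.people_id, e.label
FROM people p
JOIN education e ON p.education_id = e.education_id;

-- In the output, type = eq_ref or ref on the codelist table with a low
-- row estimate means the lookup is cheap; type = ALL (full table scan)
-- on a large table signals a missing or unusable index.
```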
Upvotes: 0
Reputation: 151
There are two major things you need to consider: first, how big the tables are, and second, indexes. If an index is missing on a large table, or the data type of a joined field differs from the data type of the field it is joined to, the query might as well take days or even months. Personally, I've done far bigger selects on enormous tables and the results were pretty good, coming in at about 2 seconds. Use EXPLAIN SELECT to see how the query stands, and if something is not OK, describe your tables, show their indexes, and compare. It's really hard to give you a definitive answer without knowing your database design...
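For example, to compare column types and indexes on the joined tables (a sketch using table names from the question):

```sql
-- Inspect definitions and indexes; joined columns should match in type
-- and the joined-on column should be indexed.
SHOW CREATE TABLE user_profile;
SHOW INDEX FROM user_profile_photo;

-- If the joined column lacks an index, add one (index name is arbitrary):
ALTER TABLE user_profile_photo ADD INDEX idx_user_id (user_id);
```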
Upvotes: 0