Reputation: 3
I always struggle with dealing with normalised data, and how I display it. Maybe its because I don't fully understand the normalisation rules, like how to get it fully into Boyce-Codd. Performance is not really an issue at this stage, though maintainability of the schema is.
user
ID Name
1 Alice
2 Bob
3 Charlie
skills
ID Name
1 Karate
2 Marksmen
3 Cook
event
ID Name
1 Island
2 Volcano
user-m2m-skill
MemberID SkillID
1 1
1 2
2 1
2 3
3 1
user-m2m-event
MemberID EventID
1 1
1 2
2 1
3 2
How do I get this information out of the database? I'd like to display a table like this, where I've got the total count of each skill:
Skills at event
Event Karate Marksmen Cook
Island 2 1 1
Volcano 2 1 0
It is unlikely that the skills table will change very much. This means I could do a set of subqueries like this (obviously shortened and incorrect syntax)
SELECT event.name,
(SELECT COUNT(*) FROM ... WHERE skill = 'Karate'),
(SELECT COUNT(*) FROM ... WHERE skill = 'Marksmen') FROM event
And that's what I've been doing, putting it into a view. But its a bit horrible, no? I have to edit the view every time I add a new skill.
The other way to to process it client side. So I just get back something like this:
Event Skill Count
Island Karate 2
Island Marksmen 1
Island Cook 1
Volcano Karate 2
Volcano Marksmen 1
And I loop through the results, reformatting it. But I hate that even more. Isn't the database supposed to do data?
So: what am I doing wrong? Am I expecting too much? Which is the lesser evil?
(As b3ta would say, apologies for length of post and for bad markup. :( )
Upvotes: 0
Views: 211
Reputation: 332701
This is a typical pivot query, because you are looking to convert data in rows into columns.
SELECT e.name,
MAX(CASE WHEN x.skill_name = 'Karate' THEN x.num_skill ELSE 0) END AS Karate,
MAX(CASE WHEN x.skill_name = 'Marksmen' THEN x.num_skill ELSE 0 END) AS Marksmen
FROM EVENT e
LEFT JOIN (SELECT um.eventid,
s.name AS skill_name,
COUNT(*) 'num_skill'
FROM SKILLS s
JOIN USER-M2M-SKILL us ON us.skillid = s.id
JOIN USER-M2M-EVENT um ON um.memberid = us.memberid
GROUP BY um.eventid, s.name) x ON x.eventid = e.id
GROUP BY e.name
...what does this have that a load of sub queries doesn't?
SELECTs as statements within the SELECT clause. IE:
SELECT x.name,
(SELECT COUNT(*) FROM TABLE)
...means that a separate query is run for every skill. If the queries were correllated - if they were tied together by an ID to make sure records sync'd with an event, then the count would be running for every event.
The approach is terribly inefficient. It is better to fetch the necessary values once, as I provided in my answer.
Regarding updating the query - it is possible to minimize the maintenance by implementing the query with dynamic SQL.
Upvotes: 1
Reputation: 10147
Who's b3ta?
As far as the database "doing data" that doesn't mean that client code will be free of all parsing and processing. And normalization in practice shouldn't b a goal in itself. You should also take into account ease of querying and performance.
Upvotes: 0
Reputation: 527238
Your second example, with "Event", "Skill", and "Count" as headers, is what you should expect from dynamically generated results from normalized data. Databases are not designed to format data for display (this isn't an Excel spreadsheet), they're designed to store data and return the meaning of that data. It's up to your code to display it in a nice fashion.
Upvotes: 0