Reputation: 2359
userid data_type, timespentaday
1 League of Legends 500
1 Hearthstone 1500
1 Hearthstone 1400
2 World of Warcraft 1200
1 Dota 2 100
2 Final Fantasy 500
1 Dota 2 700
Given this data. I would like to query the most time each user has spent on every.
Output desired:
User League Of Legends Hearthstone World of Warcraft Dota 2
1 500 1500 0 700
2 0 0 1200 0
Something along the lines of this is something I've tried
SELECT t1.* FROM user_info GROUP BY userid JOIN(
SELECT(
(SELECT max(timespentaday) where data_type='League of Legends'),
(SELECT max(timespentaday) where data_type='Hearhstone'),
(SELECT max(timespentaday) where data_type='Dota 2)'
FROM socialcount AS t2
) as t2
ON t1.userid = t2.userid
Upvotes: 2
Views: 114
Reputation: 71424
I am not going to give you a query with the output format you desire, as implementing that pivot table is going to be a very ugly and poorly performing query, as well as something that is not scalable as the number of distinct games increases.
Instead, I will focus on how to query the data in the most straightforward manner and how to read it into a data structure that would be used by application logic to create the pivot view as desired.
First the query:
SELECT
userid,
data_type,
MAX(timespentaday) AS max_timespent
FROM social_count
GROUP BY userid, data_type
This would give results like
userid data_type max_timespent
------ --------- -------------
1 League of Legends 500
1 Hearthstone 1500
1 Dota 2 700
2 World of Warcraft 1200
2 Final Fantasy 500
Now when reading the results out of the database, you just read it into a structure that is useful. I will use PHP as example language, but this should be pretty easily portable to any langauge
// will hold distinct list of all available games
$games_array = array();
// will hold user data from DB
$user_data = array();
while ($row = /* your database row fetch mechanism here */) {
// update games array as necessary
if (!in_array($row['data_type'], $games_array)) {
// add this game to $games_array as it does not exist there yet
$games_array[] = $row['data_type'];
}
// update users array
$users[$row['userid']][$row['data_type']] = $row['max_timespent'];
}
// build pivot table
foreach($users as $id => $game_times) {
// echo table row start
// echo out user id in first element
// then iterate through available games
foreach($games_array as $game) {
if(!empty($game_times[$game])) {
// echo $game_times['game'] into table element
} else {
// echo 0 into table element
}
}
// echo table row end
}
Upvotes: 1
Reputation: 8610
You will not be able to build a query with a dynamic number of columns. You can do this query if you already know the game list, which I guess is not what you need. BUT you can always post-process your results with any programming language, so you only have to retrieve the data.
The SQL query would look like this:
SELECT
userid AS User,
data_type AS Game,
max(timespentaday) AS TimeSpentADay
FROM
my_table
GROUP BY
userid
data_type
Then iterate over the results to fill any interface you want
OR
If and only if you can't afford any post-processing of any kind, you can retrieve the list of games first THEN you can build a query like the query below. Please bear in mind that this query is a lot less maintainable than the previous (beside being more difficult to build) and can and will cause you a lot of pain later in debugging.
SELECT
userid AS User,
max(CASE
WHEN data_type = 'Hearthstone' THEN timespentaday
ELSE NULL
END) AS Hearthstone,
max(CASE
WHEN data_type = 'League Of Legends' THEN timespentaday
ELSE NULL
END) AS `League Of Legends`,
...
FROM
my_table
GROUP BY
userid
The CASE
contstruction is like an if
in a procedural programming language, the following
CASE
WHEN data_type = 'League Of Legends' THEN timespentaday
ELSE NULL
END
Is evaluated to the value of timespentaday
if the game is League Of Legends, and to NULL
otherwise. The max
aggregator simply ignore the NULL
values.
Edit: added warning on the second query to explain the caveat of using a generated query thanks to Mike Brant's comment
Upvotes: 0
Reputation: 25862
basically to do this you need the greatest n per group.. there is a good article on it but the gist is in mysql you have to use variables to even get close to this.. especially with doing a pivot on the table (a fake pivot since MySQL doesn't have native support for that).
SELECT userid,
MAX(CASE WHEN data_type = "League of Legends" THEN timespentaday ELSE 0 END) as "League of Legends",
MAX(CASE WHEN data_type = "Hearthstone" THEN timespentaday ELSE 0 END) as "Hearthstone",
MAX(CASE WHEN data_type = "Dota 2" THEN timespentaday ELSE 0 END) as "Dota 2",
MAX(CASE WHEN data_type = "World of Warcraft" THEN timespentaday ELSE 0 END) as "World of Warcraft",
MAX(CASE WHEN data_type = "Final Fantasy" THEN timespentaday ELSE 0 END) as "Final Fantasy"
FROM
( SELECT *, @A := if(@B = userid, if(@C = data_type, @A + 1, 1), 1) as count_to_use, @B := userid, @C := data_type
FROM
( SELECT userid, timespentaday, data_type
FROM gamers
CROSS JOIN(SELECT @A := 0, @B := 0, @C := '') temp
ORDER BY userid ASC, data_type ASC, timespentaday DESC
) t
HAVING count_to_use = 1
)t1
GROUP BY userid
MySQL DOCS is quite clear on warnings about using user defined variables:
As a general rule, you should never assign a value to a user variable and read the value within the same statement. You might get the results you expect, but this is not guaranteed. The order of evaluation for expressions involving user variables is undefined and may change based on the elements contained within a given statement; in addition, this order is not guaranteed to be the same between releases of the MySQL Server. In SELECT @a, @a:=@a+1, ..., you might think that MySQL will evaluate @a first and then do an assignment second. However, changing the statement (for example, by adding a GROUP BY, HAVING, or ORDER BY clause) may cause MySQL to select an execution plan with a different order of evaluation.
Upvotes: 2