Dap
Dap

Reputation: 2359

Select several max types for each datatype per distinct value in mysql

userid      data_type,          timespentaday
 1       League of Legends          500
 1       Hearthstone                1500
 1       Hearthstone                1400
 2       World of Warcraft          1200
 1       Dota 2                     100
 2       Final Fantasy              500
 1       Dota 2                     700

Given this data. I would like to query the most time each user has spent on every.

Output desired:

User    League Of Legends    Hearthstone   World of Warcraft     Dota 2    
 1           500                1500             0                  700
 2           0                  0                1200               0

Something along the lines of this is something I've tried

SELECT t1.* FROM user_info GROUP BY userid JOIN(
 SELECT(
         (SELECT max(timespentaday) where data_type='League of Legends'),
          (SELECT max(timespentaday) where data_type='Hearhstone'),
          (SELECT max(timespentaday) where data_type='Dota 2)'
FROM socialcount AS t2
) as t2
ON t1.userid = t2.userid

Upvotes: 2

Views: 114

Answers (3)

Mike Brant
Mike Brant

Reputation: 71424

I am not going to give you a query with the output format you desire, as implementing that pivot table is going to be a very ugly and poorly performing query, as well as something that is not scalable as the number of distinct games increases.

Instead, I will focus on how to query the data in the most straightforward manner and how to read it into a data structure that would be used by application logic to create the pivot view as desired.

First the query:

SELECT
  userid,
  data_type,
  MAX(timespentaday) AS max_timespent
FROM social_count
GROUP BY userid, data_type

This would give results like

userid    data_type               max_timespent
------    ---------               -------------
1         League of Legends       500
1         Hearthstone             1500
1         Dota 2                  700
2         World of Warcraft       1200
2         Final Fantasy           500

Now when reading the results out of the database, you just read it into a structure that is useful. I will use PHP as example language, but this should be pretty easily portable to any langauge

// will hold distinct list of all available games
$games_array = array();
// will hold user data from DB
$user_data = array();
while ($row = /* your database row fetch mechanism here */) {
    // update games array as necessary
    if (!in_array($row['data_type'], $games_array)) {
        // add this game to $games_array as it does not exist there yet
        $games_array[] = $row['data_type'];
    }
    // update users array
    $users[$row['userid']][$row['data_type']] = $row['max_timespent'];
}

// build pivot table
foreach($users as $id => $game_times) {
    // echo table row start
    // echo out user id in first element
    // then iterate through available games
    foreach($games_array as $game) {
        if(!empty($game_times[$game])) {
            // echo $game_times['game'] into table element
        } else {
            // echo 0 into table element
        }
    }
    // echo table row end
}

Upvotes: 1

You will not be able to build a query with a dynamic number of columns. You can do this query if you already know the game list, which I guess is not what you need. BUT you can always post-process your results with any programming language, so you only have to retrieve the data.

The SQL query would look like this:

SELECT
  userid AS User,
  data_type AS Game,
  max(timespentaday) AS TimeSpentADay
FROM
  my_table
GROUP BY
  userid
  data_type

Then iterate over the results to fill any interface you want

OR

If and only if you can't afford any post-processing of any kind, you can retrieve the list of games first THEN you can build a query like the query below. Please bear in mind that this query is a lot less maintainable than the previous (beside being more difficult to build) and can and will cause you a lot of pain later in debugging.

SELECT
  userid AS User,
  max(CASE 
    WHEN data_type = 'Hearthstone' THEN timespentaday 
    ELSE NULL 
  END) AS Hearthstone,
  max(CASE 
    WHEN data_type = 'League Of Legends' THEN timespentaday 
    ELSE NULL 
  END) AS `League Of Legends`,
  ...
FROM
  my_table
GROUP BY
  userid

The CASE contstruction is like an if in a procedural programming language, the following

CASE 
    WHEN data_type = 'League Of Legends' THEN timespentaday 
    ELSE NULL 
END

Is evaluated to the value of timespentaday if the game is League Of Legends, and to NULL otherwise. The max aggregator simply ignore the NULL values.

Edit: added warning on the second query to explain the caveat of using a generated query thanks to Mike Brant's comment

Upvotes: 0

John Ruddell
John Ruddell

Reputation: 25862

basically to do this you need the greatest n per group.. there is a good article on it but the gist is in mysql you have to use variables to even get close to this.. especially with doing a pivot on the table (a fake pivot since MySQL doesn't have native support for that).

SELECT userid,
    MAX(CASE WHEN data_type = "League of Legends" THEN timespentaday ELSE 0 END) as "League of Legends",
    MAX(CASE WHEN data_type = "Hearthstone" THEN timespentaday ELSE 0 END) as "Hearthstone",
    MAX(CASE WHEN data_type = "Dota 2" THEN timespentaday ELSE 0 END) as "Dota 2",
    MAX(CASE WHEN data_type = "World of Warcraft" THEN timespentaday ELSE 0 END) as "World of Warcraft",
    MAX(CASE WHEN data_type = "Final Fantasy" THEN timespentaday ELSE 0 END) as "Final Fantasy"
FROM
(   SELECT *, @A := if(@B = userid, if(@C = data_type, @A + 1, 1), 1) as count_to_use, @B := userid, @C := data_type
    FROM
    (   SELECT userid, timespentaday, data_type
        FROM gamers
        CROSS JOIN(SELECT @A := 0, @B := 0, @C := '') temp
        ORDER BY userid ASC, data_type ASC, timespentaday DESC
    ) t
    HAVING count_to_use = 1
)t1
GROUP BY userid

DEMO

NOTE:

MySQL DOCS is quite clear on warnings about using user defined variables:

As a general rule, you should never assign a value to a user variable and read the value within the same statement. You might get the results you expect, but this is not guaranteed. The order of evaluation for expressions involving user variables is undefined and may change based on the elements contained within a given statement; in addition, this order is not guaranteed to be the same between releases of the MySQL Server. In SELECT @a, @a:=@a+1, ..., you might think that MySQL will evaluate @a first and then do an assignment second. However, changing the statement (for example, by adding a GROUP BY, HAVING, or ORDER BY clause) may cause MySQL to select an execution plan with a different order of evaluation.

Upvotes: 2

Related Questions