Zero
Zero

Reputation: 76947

MySql query to pull data in matrix form from database

I have data in database with two columns name and year. Something like this,

name | year
-----------
tim  | 2001
ron  | 2002
tim  | 2003
ron  | 2005
tim  | 2004
pol  | 2002
pol  | 2001
tim  | 2003
pol  | 2004
tim  | 2002

I want to output the result matrix something like this.

      tim | pol | ron
2001   1  |  1  |  0
2002   1  |  1  |  1
2003   2  |  0  |  0
2004   1  |  1  |  0
2005   0  |  0  |  1

tim | pol | ron is arranged in descending order based on cumulative sum i.e tim(5), pol(3), ron(2).

Data is not limited to tim,pol,ron. There can be n-different names.

Upvotes: 0

Views: 11904

Answers (3)

RandomSeed
RandomSeed

Reputation: 29769

While fthiella's approach is the most common way to build a pivot table, you may want to try this query, which is faster in most cases:

SELECT
  years.year,
  COALESCE(tim.c, 0) AS tim,
  COALESCE(pol.c, 0) AS pol,
  COALESCE(pol.c, 0) AS ron
FROM (SELECT year FROM yourtable GROUP BY year) AS years
LEFT JOIN (
  SELECT COUNT(name) AS c, year FROM yourtable WHERE name = 'tim' GROUP BY year
) AS tim ON (tim.year = years.year)
LEFT JOIN (
  SELECT COUNT(name) AS c, year FROM yourtable WHERE name = 'pol' GROUP BY year
) AS pol ON (pol.year = years.year)
LEFT JOIN (
  SELECT COUNT(name) AS c, year FROM yourtable WHERE name = 'ron' GROUP BY year
) AS ron ON (ron.year = years.year)
GROUP BY years.year;

This is a bit more painful to write, and to perform well, it requires two indexes:

  • a two-columns index on both fields used as column and row discriminant, in this order (ie. in your case, on (name, year) in this order)
  • an index on the row discriminant (ie. in your case, on (year))

But the larger the result (either more rows or more columns or both), the faster it performs compared to the usual approach. I posted some sample data here if anyone feels like playing with it. The corresponding queries are here.

Upvotes: 0

Noam Rathaus
Noam Rathaus

Reputation: 5598

You can do the following then at the presentation layer transpose it.

SELECT yourtable.year, Hits.name, Hits.hit
FROM yourtable
LEFT JOIN (SELECT
 year, name, count(*) as hit
FROM
  yourtable
GROUP BY
   year, name) AS Hits ON yourtable.year = Hits.year
GROUP BY yourtable.year, yourtable.name

It lacks no-hits, i.e. those names that don't fall in a certain year, i.e. you won't see any 0.

Upvotes: 0

fthiella
fthiella

Reputation: 49079

SELECT
  year,
  SUM(name='tim') tim,
  SUM(name='pol') pol,
  SUM(name='ron') ron
FROM
  yourtable
GROUP BY
   year

Please see fiddle here.

EDIT: if you need a dynamic query because the exact number of values could vary, you could use a prepared statement like this:

SELECT
  CONCAT('SELECT year,',
  GROUP_CONCAT(sums),
  ' FROM yourtable GROUP BY year')
FROM (
SELECT CONCAT('SUM(name=\'', name, '\') AS `', name, '`') sums
FROM yourtable
GROUP BY name
ORDER BY COUNT(*) DESC
) s
INTO @sql;

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

Fiddle is here.

Upvotes: 5

Related Questions