Reputation: 163
I have a question about how to pivot the table in MySQL. I have a dataset, columns like this:
ID Name job_title
1 Sam Fireman
2 Tomas Driver
3 Peter Fireman
4 Lisa Analyst
5 Marcus Postman
6 Stephan Analyst
7 Mary Research Manager
8 Albert Analyst
9 Chen Driver
...etc...
And I want to generate a table like this:
Fireman Driver Analyst Postman Research Manager ...
Sam Tomas Lisa Marcus Mary
Peter Chen Stephan (someone) (someone)...
....etc...
Since, this is just a sample from the datasets, so I may not know how much different job titles in the dataset. The goal is to list every person in the different job title columns.
Is there any methods to do that? Or is it possible to generate such table in MySQL? An engineer told me that it can done by creating a view, but I do not know how. I read some books, and still confused.
Any ideas and SQL queries guides are welcome!
Upvotes: 11
Views: 21871
Reputation: 1800
This solution perfectly matches my problem statement, your problem statement is similar to mine
In your case, the job title is dynamic, which means you are not sure about the possible job titles, The logic will be , get the unique job_titles and then go through all the rows using the window function.
To build the final query we will divide it into a few parts then we concant the query to make the final result
@sql
variableROW_NUMBER() OVER (PARTITION BY job_title ORDER BY Name) AS row_num
to check the number of repetitions and stored this query in @main_query
and concat with @sql
SET @sql = NULL;
#Step 1: Build the SELECT statement dynamically
SELECT GROUP_CONCAT(
CONCAT(
'COALESCE(MAX(CASE WHEN job_title = ''', job_title, ''' THEN Name END), ''(Someone)'') AS `', job_title, '`'
)
) INTO @sql
FROM (SELECT DISTINCT job_title FROM pivot_table) AS t;
# Step 2: Combine it into the final query with a CTE
SET @main_query = CONCAT(
'WITH base AS (
SELECT
Name,
job_title,
ROW_NUMBER() OVER (PARTITION BY job_title ORDER BY Name) AS row_num
FROM pivot_table
)
SELECT row_num, ',
@sql,
' FROM base
GROUP BY row_num
ORDER BY row_num;'
);
#show the final query, it won't execute here you can remove this line of you want
select @main_query;
# Step 3: Prepare and execute the dynamic SQL
PREPARE stmt FROM @main_query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
our final query will be built like this
WITH base AS (
SELECT
Name,
job_title,
ROW_NUMBER() OVER (PARTITION BY job_title ORDER BY Name) AS row_num
FROM pivot_table
)
SELECT row_num, COALESCE(MAX(CASE WHEN job_title = 'Fireman' THEN Name END), '(Someone)') AS `Fireman`,COALESCE(MAX(CASE WHEN job_title = 'Driver' THEN Name END), '(Someone)') AS `Driver`,COALESCE(MAX(CASE WHEN job_title = 'Analyst' THEN Name END), '(Someone)') AS `Analyst`,COALESCE(MAX(CASE WHEN job_title = 'Postman' THEN Name END), '(Someone)') AS `Postman`,COALESCE(MAX(CASE WHEN job_title = 'Research' THEN Name END), '(Someone)') AS `Research`
FROM base
GROUP BY row_num
ORDER BY row_num;
Upvotes: 0
Reputation: 1
Without using table view, you can get the table using SQL CTE(Common Table Expression) and partition columns by ROW_NUMBER(). This is what I have tried out to get the table.
create table Employees( ID int PRIMARY KEY, Name varchar(50), Job_title varchar(50) )
Insert Into Employees values(1, 'Sam', 'Fireman'); Insert Into Employees values(2, 'Tomas', 'Driver'); Insert Into Employees values(3, 'Peter', 'Fireman'); Insert Into Employees values(4, 'Lisa', 'Analyst'); Insert Into Employees values(5, 'Marcus', 'Postman'); Insert Into Employees values(6, 'Stephan', 'Analyst'); Insert Into Employees values(7, 'Mary', 'Research Manager'); Insert Into Employees values(8, 'Albert', 'Analyst'); Insert Into Employees values(9, 'Chen', 'Driver');
WITH ranked_data AS ( SELECT Name, job_title, ROW_NUMBER() OVER (PARTITION BY job_title ORDER BY ID) AS row_num FROM Employees ) SELECT MAX(CASE WHEN job_title = 'Fireman' THEN Name END) AS Fireman, MAX(CASE WHEN job_title = 'Driver' THEN Name END) AS Driver, MAX(CASE WHEN job_title = 'Analyst' THEN Name END) AS Analyst, MAX(CASE WHEN job_title = 'Postman' THEN Name END) AS Postman, MAX(CASE WHEN job_title = 'Research Manager' THEN Name END) AS `Research Manager` FROM ranked_data GROUP BY row_num ORDER BY row_num;
Output
Fireman | Driver | Analyst | Postman | Research Manager |
---|---|---|---|---|
Sam | Tomas | Lisa | Marcus | Mary |
Peter | Chen | Stephan | ||
Albert |
Upvotes: -1
Reputation: 52
If you want to easily creat pivot tables (also known as cross tabulation) from any MySQL database where raws of any table are converted to dynamic columns, I recommend Smart Pivot table. This tool uses a very an easy to use wizard-style interface to generate pivot tables which you later can export to MS Excel.
Smart Pivot table is a dynamic PHP reporting tool which means it automatically updates your pivot tables whenever your database is updated.
Generated Pivot table from MySQL DB
Upvotes: -1
Reputation: 1
It's better to start from the result and try to map to the original table. Basically each row of the result table should be in the same group in the original table. And the CTE table grouping and rank window function creates the group order by name.
with grouping as (
select
Name,
job_title,
rank() over (partition by job_title order by name) as rnk
from jobs
)
select
group_concat(if(g.job_title = 'Fireman', g.Name, NULL)) as 'Fireman',
group_concat(if(g.job_title = 'Driver',g.Name, NULL)) as 'Driver',
group_concat(if(g.job_title = 'Analyst', g.Name, NULL)) as 'Analyst',
group_concat(if(g.job_title = 'Research Manager', g.Name, NULL)) as 'Research Manager'
from grouping g
group by g.rnk
order by g.rnk
Upvotes: -1
Reputation: 1
Look at JSON services (JSON_OBJECTAGG,JSON_OBJECT)
, it can be parsed in java with basin object mapping (Jackson).
select xyz, JSON_OBJECTAGG( a, b) as pivit_point
from ... group by xyz;
Upvotes: -1
Reputation: 647
I also encountered this problem on HackerRank. While I think the group_concat
answer is very good and is typically used in these sorts of pivot situations with earlier versions of MySql, I find that concat
and group_concat
can be difficult to read and understand.
If your version of MySql supports window functions then you can solve this using temporary tables, as MySql does not support outer joins. You'll need a separate temp table for every pivot column to avoid Window function is not allowed in window specification errors:
use test;
drop table if exists occupations;
create table if not exists occupations (
name varchar(50)
,occupation varchar(50)
);
insert into occupations (name, occupation) select 'Samantha', 'Doctor'
union all select 'Julia', 'Actor'
union all select 'Maria', 'Actor'
union all select 'Meera', 'Singer'
union all select 'Ashley', 'Professor'
union all select 'Kelly', 'Professor'
union all select 'Christeen', 'Professor'
;
-- the way to approach this in mysql is to create a temp table with ordinals.
-- then upsert with four queries using row_number()
-- nb full join not supported. let's try temp table
drop table if exists doctors;
create temporary table doctors
(
name varchar(50)
,occupation varchar(50)
,ordinal int
);
insert into doctors
select
name
,occupation
,row_number() over (partition by occupation order by name) as ordinal
from occupations
where occupation = 'Doctor'
;
drop table if exists actors;
create temporary table actors
(
name varchar(50)
,occupation varchar(50)
,ordinal int
);
insert into actors
select
name
,occupation
,row_number() over (partition by occupation order by name) as ordinal
from occupations
where occupation = 'Actor'
;
drop table if exists professors;
create temporary table professors
(
name varchar(50)
,occupation varchar(50)
,ordinal int
);
insert into professors
select
name
,occupation
,row_number() over (partition by occupation order by name) as ordinal
from occupations
where occupation = 'Professor'
;
drop table if exists singers;
create temporary table singers
(
name varchar(50)
,occupation varchar(50)
,ordinal int
);
insert into singers
select
name
,occupation
,row_number() over (partition by occupation order by name) as ordinal
from occupations
where occupation = 'Singer'
;
-- upsert: update if not exists
drop table if exists results;
create temporary table results
(
singer varchar(50)
,actor varchar(50)
,doctor varchar(50)
,professor varchar(50)
,ordinal int primary key
);
insert into results (singer, ordinal)
select name, ordinal from singers
on duplicate key update singer = name
;
insert into results (actor, ordinal)
select name, ordinal from actors
on duplicate key update actor = name
;
insert into results (doctor, ordinal)
select name, ordinal from doctors
on duplicate key update doctor = name
;
insert into results (professor, ordinal)
select name, ordinal from professors
on duplicate key update professor = name
;
select singer, actor, doctor, professor from results;
Ps. I have to disagree with earlier comments: this is a pivot. We are projecting rows into columns, with the rows being a projection of occupations and ordinals.
Upvotes: 1
Reputation: 17640
There are 3 things to think about 1) How to dynamically generate a bunch of max(case when 2) assigning something to group the case when's by - in this case I generate a row number using a variable 3) some of your job titles contain white space which I remove for the generation of column headers
set @sql =
(select concat('select ', gc, ' from
(select name,job_title,
if (job_title <> @p, @rn:=1 ,@rn:=@rn+1) rn,
@p:=job_title p
from t
cross join (select @rn:=0,@p:=null) r
order by job_title
) s group by rn;') from
(select
group_concat('max(case when job_title = ', char(39),job_title ,char(39),' then name else char(32) end ) as ',replace(job_title,char(32),'')) gc
from
(
select distinct job_title from t
) s
) t
)
;
Generates this sql code
select max(case when job_title = 'Fireman' then name else char(32) end ) as Fireman,
max(case when job_title = 'Driver' then name else char(32) end ) as Driver,
max(case when job_title = 'Analyst' then name else char(32) end ) as Analyst,
max(case when job_title = 'Postman' then name else char(32) end ) as Postman,
max(case when job_title = 'Research Manager' then name else char(32) end ) as ResearchManager
from
(select name,job_title,
if (job_title <> @p, @rn:=1 ,@rn:=@rn+1) rn,
@p:=job_title p
from t
cross join (select @rn:=0,@p:=null) r
order by job_title
) s group by rn;
Which can be submitted to dynamic sql
prepare sqlstmt from @sql;
execute sqlstmt;
deallocate prepare sqlstmt;
result
+---------+--------+---------+---------+-----------------+
| Fireman | Driver | Analyst | Postman | ResearchManager |
+---------+--------+---------+---------+-----------------+
| Sam | Tomas | Lisa | Marcus | Mary |
| Peter | Chen | Stephan | | |
| | | Albert | | |
+---------+--------+---------+---------+-----------------+
3 rows in set (0.00 sec)
Upvotes: 3
Reputation: 155250
The desired output data you posted is not an example of pivoted data because values in the same row have no relationship with each other, it sounds like you just want a compact representation of everyone on a per-cell basis. This makes it a view-level concern and should not be performed in SQL, but should be performed in your view-level (presumably a PHP web-page, as you're using MySQL).
Your output data is column-oriented, not row-oriented, but HTML tables (and most datagrid components for other platforms like WinForms, Java and WPF) are row-oriented, so you'll need to think about how to do it.
Assuming you're targeting HTML and taking into account the row-oriented vs. column-oriented conversion required, try this (pseudo-code)
define type DBResultRow {
id: int,
name: string,
job_title: string
}
let rows : List<DBResultRow> = // get rows from your view, no changes to SQL required
let jobTitles : List<String>
let jobTitleMap : Map<String,Int32>
let outputTable : List<List<String>>
foreach( person: DBResultRow in rows )
{
let columnIdx = jobTitleMap[ person.job_title ];
if( !columnIdx )
{
jobTitles.Add( person.job_title );
columnIdx = jobTitles.Count - 1;
jobTitleMap[ person.job_title, columnIdx ];
}
outputTable[ columnIdx ].Add( person.name );
}
let longestColumnLength = outputTable.Select( col => col.Count ).Max();
Then render to HTML:
<table>
<thead>
<tr>
foreach( jobTitle: String in jobTitles )
{
<th><%= jobTitle #></th>
}
</tr>
</thead>
<tbody>
for( row = 0; row < longestColumnLength; row++ )
{
<tr>
for( col = 0; col < jobTitles.Count; col++ )
{
if( row > outputTable[ col ].Count )
{
<td></td>
}
else
{
<td><%= outputTable[ col ][ row ] %></td>
}
}
</tr>
}
</tbody>
</table>
Upvotes: -1