Reputation: 470
I'm trying to get pivot table using WITH
clause, but i'm stuck on how to return multiple or duplicate rows using MAX()
.
Here is my original query:
SELECT b.detail_id, a.sampling_date, a.sampling_area,
b.sampling_point, b.sampling_type,
b.ha_tpc, b.ha_entero, b.ha_ecoli, b.ha_salmonella
FROM tbl_header a
JOIN tbl_detail b ON a.headerid = b.headerid
WHERE
a.sampling_date = '2016-12-09' AND
a.sampling_area = 'CMP' AND
(b.sampling_point ~* '.*(flex).*' OR b.sampling_point ~* '.*(HPM).*' OR b.sampling_point ~* '.*(fr).*') AND
(b.sampling_type = 'Personil' OR b.sampling_type = 'Equipment')
And here is the results:
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
detail_id | sampling_date | sampling_area | sampling_point | sampling_type | ha_tpc | ha_entero | ha_ecoli | ha_salmonella |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
24243 | 2016-12-09 | CMP | BOIL013 (OPERATOR ENFLEX) | Personil | 500 | 50 | Abs | Abs |
24289 | 2016-12-09 | CMP | MP115 (OPR ENFLEX) | Personil | 300 | 50 | Abs | Abs |
24284 | 2016-12-09 | CMP | WT033 (FR) | Personil | 250 | 50 | Abs | Abs |
With those records i try to create a pivot query like the following:
WITH tmp_date AS (
SELECT sampling_date.sampling_date::date AS sampling_date
FROM generate_series(
((
SELECT min(tbl_header.sampling_date) AS min
FROM tbl_header
))::timestamp with time zone,
((
SELECT max(tbl_header.sampling_date) AS max
FROM tbl_header
))::timestamp with time zone, '1 day'::interval
) sampling_date(sampling_date)
),
tmp_detail AS (
SELECT a.sampling_date, a.sampling_area,
b.detail_id, b.sampling_point, b.sampling_type,
b.ha_tpc, b.ha_entero, b.ha_ecoli, b.ha_salmonella
FROM tbl_header a
JOIN tbl_detail b ON a.headerid = b.headerid
),
resulttable AS (
SELECT tmp_date.sampling_date, tmp_detail.sampling_point, tmp_detail.sampling_type,
-- Case of Filling Room
CASE
WHEN tmp_detail.sampling_point ~* '.*(fr).*' AND tmp_detail.sampling_type = 'Personil' THEN tmp_detail.sampling_point
ELSE NULL
END AS fr_name,
CASE
WHEN tmp_detail.sampling_point ~* '.*(fr).*' AND tmp_detail.sampling_type = 'Personil' THEN tmp_detail.ha_tpc
ELSE NULL
END AS fr_tpc,
CASE
WHEN tmp_detail.sampling_point ~* '.*(fr).*' AND tmp_detail.sampling_type = 'Personil' THEN tmp_detail.ha_entero
ELSE NULL
END AS fr_entero,
CASE
WHEN tmp_detail.sampling_point ~* '.*(fr).*' AND tmp_detail.sampling_type = 'Personil' THEN tmp_detail.ha_ecoli
ELSE NULL
END AS fr_ecoli,
CASE
WHEN tmp_detail.sampling_point ~* '.*(fr).*' AND tmp_detail.sampling_type = 'Personil' THEN tmp_detail.ha_salmonella
ELSE NULL
END AS fr_salmo,
-- Case of Hopper Auger Filling
CASE
WHEN (tmp_detail.sampling_point ~* '.*(flex).*' OR tmp_detail.sampling_point ~* '.*(HPM).*') AND tmp_detail.sampling_type = 'Personil' THEN tmp_detail.sampling_point
ELSE NULL
END AS hpm_name,
CASE
WHEN (tmp_detail.sampling_point ~* '.*(flex).*' OR tmp_detail.sampling_point ~* '.*(HPM).*') AND tmp_detail.sampling_type = 'Personil' THEN tmp_detail.ha_tpc
ELSE NULL
END AS hpm_tpc,
CASE
WHEN (tmp_detail.sampling_point ~* '.*(flex).*' OR tmp_detail.sampling_point ~* '.*(HPM).*') AND tmp_detail.sampling_type = 'Personil' THEN tmp_detail.ha_entero
ELSE NULL
END AS hpm_entero,
CASE
WHEN (tmp_detail.sampling_point ~* '.*(flex).*' OR tmp_detail.sampling_point ~* '.*(HPM).*') AND tmp_detail.sampling_type = 'Personil' THEN tmp_detail.ha_ecoli
ELSE NULL
END AS hpm_ecoli,
CASE
WHEN (tmp_detail.sampling_point ~* '.*(flex).*' OR tmp_detail.sampling_point ~* '.*(HPM).*') AND tmp_detail.sampling_type = 'Personil' THEN tmp_detail.ha_salmonella
ELSE NULL
END AS hpm_salmo
FROM tmp_date
FULL JOIN tmp_detail ON tmp_date.sampling_date = tmp_detail.sampling_date
WHERE
tmp_detail.sampling_area = 'CMP' AND
tmp_detail.sampling_type = 'Personil' AND
(
tmp_detail.sampling_point ~* ANY ( VALUES ('(fr)'), ('(flex)'), ('(HPM)') )
)
)
SELECT
resulttable.sampling_date,
max(resulttable.sampling_type) AS sampling_type,
max(resulttable.fr_name) AS fr_name,
max(resulttable.fr_tpc) AS fr_tpc,
max(resulttable.fr_entero) AS fr_entero,
max(resulttable.fr_ecoli) AS fr_ecoli,
max(resulttable.fr_salmo) AS fr_salmonella,
max(resulttable.hpm_name) AS hopper_name,
max(resulttable.hpm_tpc) AS hopper_tpc,
max(resulttable.hpm_entero) AS hopper_entero,
max(resulttable.hpm_ecoli) AS hopper_ecoli,
max(resulttable.hpm_salmo) AS hopper_salmonella
FROM resulttable
GROUP BY resulttable.sampling_date ORDER BY resulttable.sampling_date;
I don't know why those query only return 1 rows, is it because i used GROUP BY
?:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
sampling_date | sampling_type | fr_name | fr_tpc | fr_entero | fr_ecoli | fr_salmonella | hopper_name | hopper_tpc | hopper_entero | hopper_ecoli | hopper_salmonella |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2016-12-09 | Personil | WT033 (FR) | 250 | 50 | Abs | Abs | BOIL013 (OPERATOR ENFLEX) | 500 | 50 | Abs | Abs |
I actually want to get a result like the following:
===================== Case of Filling Room =========================== | ===================== Case of Hopper Auger Filling ============================
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
sampling_date | sampling_type | fr_name | fr_tpc | fr_entero | fr_ecoli | fr_salmonella | hopper_name | hopper_tpc | hopper_entero | hopper_ecoli | hopper_salmonella |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2016-12-09 | Personil | WT033 (FR) | 250 | 50 | Abs | Abs | BOIL013 (OPERATOR ENFLEX) | 500 | 50 | Abs | Abs |
2016-12-09 | Personil | | | | | | MP115 (OPR ENFLEX) | 300 | 50 | Abs | Abs |
Any help would be appreciated, thank you.
Upvotes: 0
Views: 112
Reputation: 26454
Your problem is that your GROUP BY
clause specifies criteria that is not specific enough for your needs. Since you only have one sampling date, you get only one row out.
Reading your output, I think you want to add hopper_name to the GROUP BY
terms. However it is not 100% clear what you are trying to do so understanding this clause and other alternatives may be important.
GROUP BY without grouping sets
This is currently what you are doing. It says, basically, for every row which is unique across this criteria, return one row out. This is probably what you want. So make sure you have here GROUP BY referencing exactly what you want.
Alternatives: Grouping Sets with Rollup and cube
A complex solution to be aware of is that PostgreSQL allows you to have multiple grouping sets along with rolling up based on criteria. You might want to look at ROLLUP
or CUBE
for this if GROUP BY
does not get you where you want to go. However, this complicates things a bit in retrieving the data because you want to get a bit map of your grouped columns. However since you are trying to do a pivot table, that may be worth noting for further steps forward (as well as pointing other to as well).
Upvotes: 1