sandy
sandy

Reputation: 123

How to include column which is not a part of group by

How to get max of w_cost by v_id and also the final result set should include av_id.

s_id sg_id  r_cost  w_cost  av_id v_id
123  100    0.50    1.00    1     333
123  105    0.75    0.50    2     333
123  330    2.00    Null    3     888

If w_cost is NULL, r_cost should be taken. The final result should be:

s_id v_id   w_cost  av_id
123  333     1.00   1
123  888     2.00   3

Basic query is

SELECT
t.s_id,
sv.v_id,
sv.w_cost,
CASE
  WHEN sv.w_cost IS NULL THEN
    sv.r_cost::numeric
  ELSE sv.w_cost::numeric
  END AS cost
FROM test t
INNER JOIN stra_ven sv tmad ON
t.s_id = sv.s_id 
GROUP BY t.s_id,sv.v_id,sv.w_cost;

Upvotes: 1

Views: 78

Answers (2)

S-Man
S-Man

Reputation: 23676

Window Functions:

This is what window functions are made for https://www.postgresql.org/docs/current/static/tutorial-window.html

See the db<>fiddle

SELECT 
    s_id, v_id, w_cost, av_id
FROM
    (SELECT 
        s_id,
        v_id,
        av_id,
        COALESCE(w_cost, r_cost) as w_cost,                                    -- A
        MAX(COALESCE(w_cost, r_cost)) OVER (PARTITION BY v_id) as max_w_cost   -- B
     FROM testdata) s
WHERE 
    max_w_cost = w_cost                                                        -- C

A: COALESCE gives the first not NULL value in the list. So if w_cost is NULL, r_cost will be taken.

B: The window function MAX() gives the max value in the partition of v_id. The max function ueses the same COALESCE clause as in (A)

C: The WHERE clause filters the row where max equals the current value of w_cost.

If there are more rows with the same MAX value in my example you get all of them. If you just want one of them then you can add a column to the partition to make the window more precise. Or you can order by something and just take the first one or you take a more or less random one by DISTINCT ON.


DISTINCT ON:

With DISTINCT ON you can filter the distinct row for special columns (whereas the normal DISTINCT looks at all columns). Because a result set without any ORDER BY clause can be very random, it should be sorted by v_id and the final cost (greatest first (DESC); calculated with the COALESCE function as stated above). Then the DISTINCT takes the first row.

db<>fiddle

SELECT DISTINCT ON (v_id)                  -- C
    s_id, v_id, cost as w_cost, av_id
FROM
    (SELECT 
        s_id,
        v_id,
        av_id,
        COALESCE(w_cost, r_cost) as cost   -- A
     FROM testdata
     ORDER BY v_id, cost DESC) s           -- B

A: COALESCE as mentioned in the window function section.

B: Ordering to get the wanted row first.

C: DISTINCT ON filters for every distinct v_id the first row.

Upvotes: 1

Thiago Procaci
Thiago Procaci

Reputation: 1523

The SQL below might work:

with my_table as (
select 
   123 as s_id,
   100 as sg_id,
   0.50 as r_cost,
   1.00 as  w_cost,
   1 as av_id,
   333 as v_id
union all
   select 
    123 as s_id,
    105 as sg_id,
    0.75 as r_cost,
    0 as  w_cost,
    2 as av_id,
    333 as v_id
union all 
select 
   123 as s_id,
   330 as sg_id,
   1.00 as r_cost,
   Null as  w_cost,
   3 as av_id,
   888 as v_id
),
w_r_cost_table as (
select t.*,
        case 
            when t.w_cost is not null then t.w_cost
            else t.r_cost
        end as w_r_cost
        from my_table t
),
grouped_table as (
    select  
      A.v_id, 
      max(A.w_r_cost) as w_cost  
from w_r_cost_table A
group by A.v_id
)
select 
(select t.s_id from w_r_cost_table t where t.w_r_cost = g.w_cost and t.v_id = g.v_id) as s_id,
g.v_id,
g.w_cost,
(select t.av_id from w_r_cost_table t where t.w_r_cost = g.w_cost and t.v_id = g.v_id) as av_id

from grouped_table g

Let's assume that the temp table my_table is your source table. First, we should define the corrected values for the column w_cost. As in your question, we should not consider null values for w_cost.

To do this, I created a temp table called w_r_cost_table. Using "case when" we can make an if/else clause and put the corrected value in a new column called w_r_cost.

Then, using the temp table w_r_cost_table, we can perform a group by using the field w_r_cost, as done in grouped_table.

Finally, just make a select in grouped_table to have your results.

I used Postgres in this example.

Upvotes: 0

Related Questions