Reputation: 53
I have a dataset where each row is a salary change, and I am trying to get the last salary (that is the highest), with all the details, as the reason (if the salary has changed) and the date of last change, and as the last column I want the value of the salary before. How can I get it?
I've done with the last salary and the first salary using MIN(), but I want the salary before, not the first. The result that I am getting:
company_id name last_salary_change reason updated_at salary_before
29 Fulano 5000 promotion 2019-05-20 1200
29 Ramon 25000 adjustment 2019-03-23 11500
The query that I've used:
SELECT p.company_id,
u.name AS name,
MAX(psc.amount/100) AS last_salary,
MAX(psc.reason) AS reason,
MAX(psc.updated_at) AS updated,
MIN(psc.amount/100) AS first_salary
FROM lukla.profiles AS p
INNER JOIN lukla.profile_salary_changes AS psc
ON p.id = psc.profile_id
INNER JOIN lukla.users AS u
ON p.id = u.profile_id
WHERE p.company_id = 29 (filtered by a specific company)
GROUP BY 1, 2
What I am looking for:
company_id name last_salary_change reason updated_at salary_before
29 Fulano 5000 promotion 2019-05-20 3500
29 Ramon 25000 adjustment 2019-03-23 24000
Upvotes: 3
Views: 142
Reputation: 1
Assuming your database structure looks like this: https://www.db-fiddle.com/f/i2BNYKhSaiu1xGPDPfeydr/1
You can run this query:
SELECT p.company_id,
p.id,
last_salary.amount/100 AS last_salary,
last_salary.reason AS reason,
last_salary.updated_at AS updated,
prev_salary.updated_at as salary_before_updated_at,
prev_salary.amount/100 AS salary_before
FROM profiles AS p
LEFT JOIN profile_salary_changes AS last_salary
ON p.id = last_salary.profile_id
AND NOT EXISTS (
SELECT * FROM profile_salary_changes psc2
WHERE psc2.profile_id = last_salary.profile_id
AND psc2.updated_at > last_salary.updated_at)
LEFT JOIN profile_salary_changes AS prev_salary
ON p.id = prev_salary.profile_id
AND prev_salary.updated_at != last_salary.updated_at
AND NOT EXISTS (
SELECT * FROM profile_salary_changes psc3
WHERE psc3.profile_id = prev_salary.profile_id
AND psc3.updated_at < last_salary.updated_at
AND psc3.updated_at > prev_salary.updated_at)
WHERE p.company_id = 29;
If you have a primary field like id
in your profile_salary_changes
table, it's better to use it instead of updated_at
in comparisons
Upvotes: 0
Reputation: 14861
IMHO There are major issues with the entire concept. They stem from the assumptions that the first salary is always the lowest, and the last is the high test, and that salaries are always upward bound. None of these are true. The assumption that MAX can be used to get the latest also leads to a structure that would be rare at best. The section:
MAX(psc.amount/100) AS last_salary,
MAX(psc.reason) AS reason,
MAX(psc.updated_at) AS updated,
Even admitting that MAX psc.amount would get the correct value, we can assume the date is also selected correctly. However, the same does not apply for reason, care must always be taken getting the max of a text in combination with another column. There is no requirement that max text corresponds to max amount.
This was an interesting problem to play with so the following resolves these issues. Take it or ignore it as you see fit. Either way I found it interesting.
--- create "tables" as cte
with profiles (id, company_id) as
( values (61,29)
, (62,29)
, (63,29)
, (64,29)
)
, profile_salary_changes (profile_id, amount, reason, updated_at) as
( values (64, 640000, 'Initial Hire', '2018-02-15'::date)
, (64, 611200, 'Salary cut 4.5% across the board Co wide.', '2018-07-05'::date)
, (64, 625600, '50% July''s cut recovery', '2018-12-09'::date)
, (64, 710000, 'Promotion', '2019-02-09'::date)
, (63, 630000, 'Initial ', '2019-02-15'::date)
, (63, 600000, 'Transfer.', '2019-07-05'::date)
, (63, 627500, 'COL', '2019-12-09'::date)
, (61, 100000, 'Initial Only', '2019-05-09'::date)
, (62, 620000, 'First', '2019-03-09'::date)
, (62, 625000, 'Bonus', '2019-08-09'::date)
)
, users (profile_id, name) as
( values (61, 'Test1')
, (62, 'Test2')
, (63, 'Test3')
, (64, 'Test4')
)
-- final selection pick up designated columns
select company_id
, name
, reason
, trunc(amount/100.0,2) last_salary
, updated_at
, trunc(prev_sal/100.0,2) prev_salary
, trunc(first_salary/100.0,2) first_salary
from ( -- pick up the appropriate previous and first salary
select x.*
, lag (amount) over( partition by company_id, profile_id order by updated_at ) prev_sal
, lag (amount, (rn-1)::integer) over( partition by company_id, profile_id order by updated_at ) first_salary
from (
-- gather all columns, number each row for company and profile, and get number of total rows for each set
select p.company_id
, u.name
, psc.*
, row_number() over( partition by p.company_id, psc.profile_id order by updated_at) rn
, count(*) over( partition by p.company_id, psc.profile_id) r
from profile_salary_changes psc
join profiles p on (p.id = psc.profile_id)
join users u on (p.id = u.profile_id)
) x
) z
-- select only the last row in each set. Additional salary values have been attached
where r = rn
order by name;
Upvotes: 0
Reputation: 53
The answer is based on the @Gordon's answer, the only diff is that I've added
JOIN lukla.profiles p
ON p.id = psc.profile_id
JOIN lukla.users u
ON p.id = u.profile_id
inside the first JOIN.
SELECT p.company_id,
u.name AS name,
MAX(psc.amount/100) AS last_salary,
MAX(psc.reason) AS reason,
MAX(psc.updated_at) AS updated,
MIN(psc.amount/100) AS first_salary,
MAX(amount / 100) FILTER (WHERE seqnum = 2) as prev_salary
FROM lukla.profiles p
JOIN
(SELECT psc.*,
ROW_NUMBER() OVER (PARTITION BY p.company_id, u.name ORDER BY psc.updated_at) as seqnum
FROM lukla.profile_salary_changes psc
JOIN lukla.profiles p
ON p.id = psc.profile_id
JOIN lukla.users u
ON p.id = u.profile_id
) psc
ON p.id = psc.profile_id
JOIN lukla.users u
ON p.id = u.profile_id
WHERE p.company_id = 29
GROUP BY 1, 2;
Upvotes: 1
Reputation: 1269623
You can use window functions with conditional aggregation:
SELECT p.company_id,
u.name AS name,
MAX(psc.amount/100) AS last_salary,
MAX(psc.reason) AS reason,
MAX(psc.updated_at) AS updated,
MIN(psc.amount/100) AS first_salary,
MAX(amount / 100) FILTER (WHERE seqnum = 2) as prev_salary
FROM lukla.profiles p INNER JOIN
(SELECT ps.*,
ROW_NUMBER() OVER (PARTITION BY p.company_id, u.name ORDER BY psc.updated_at) as seqnum
FROM lukla.profile_salary_changes psc
) psc
ON p.id = psc.profile_id INNER JOIN
lukla.users u
ON p.id = u.profile_id
WHERE p.company_id = 29 (filtered by a specific company)
GROUP BY 1, 2;
Two comments:
JOIN
s are not bringing in new rows.Upvotes: 0
Reputation: 13509
It seems you need second maximum salary for that column. You can try below query -
SELECT p.company_id,
u.name AS name,
MAX(psc.amount/100) AS last_salary,
MAX(psc.reason) AS reason,
MAX(psc.updated_at) AS updated,
(SELECT MAX(psc.amount/100)
FROM lukla.profile_salary_changes psc2
WHERE psc2.profile_id = profile_salary_changes.profile_id
AND (psc.amount/100) < (SELECT MAX(psc.amount/100)
FROM lukla.profile_salary_changes psc2
WHERE psc2.profile_id = profile_salary_changes.profile_id)) salary_before
FROM lukla.profiles AS p
INNER JOIN lukla.profile_salary_changes AS psc
ON p.id = psc.profile_id
INNER JOIN lukla.users AS u
ON p.id = u.profile_id
WHERE p.company_id = 29 (filtered by a specific company)
GROUP BY 1, 2
Though I have not tested it but it should work.
Upvotes: 0