Doing some operation on a date column in PostgreSQL

Question

I have a table with the following columns:

personnel_id INT, date DATE, time TIME WITHOUT TIME ZONE.

Every day each person has a clock-in and a clock-out. There can also be more than one clock-in/clock-out pair per day. For example, a person may arrive at 8:00 and leave at 13:00, then come back and clock in at 16:36 and finally leave at 19:20.

So for each date I need to calculate the total hours a person was present at work and, from that, the working hours of that person in each month. I therefore need a select that takes a personnel_id and returns the working hours of that person for each month. For example:

ID  1        2      3        4      5        6      7        8   9       10   11    12  
3   173.24   134    147.26   180    50.47    138    196.36   47  93.56   .56  78    139

user330315 · Accepted Answer

You first need to calculate the duration for each clock-in/clock-out combination.

Assuming that you always have an even number of entries for each date for each personnel_id, you can calculate the duration for each pair with something like the following:

select personnel_id, 
       "date", 
       case 
         when row_number() over w % 2 = 0 then "time" - lag("time") over w
       end as duration
from person_work
window w as (partition by personnel_id, "date" order by "time")
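
The even-count assumption can be checked before relying on the pairing. The following is only a sketch: the inline person_work rows are made-up sample data (with a deliberately missing clock-out) so that the statement runs on its own; against the real table, drop the with clause.

```sql
-- Made-up sample rows standing in for the real person_work table.
with person_work (personnel_id, "date", "time") as (
  values (3, date '2018-01-02', time '08:00'),
         (3, date '2018-01-02', time '13:00'),
         (3, date '2018-01-02', time '16:36')  -- no matching clock-out
)
-- List every person/day combination with an odd number of entries.
select personnel_id,
       "date",
       count(*) as entries
from person_work
group by personnel_id, "date"
having count(*) % 2 <> 0
order by personnel_id, "date";
```

Any row returned here is a day on which the odd/even pairing would put clock-ins and clock-outs together incorrectly.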

row_number() is a window function that assigns a sequential number to each row within its partition. lag() is another window function that returns the value of a column from the previous row. Because both functions share the same window definition, I declared it only once, with the window clause at the end. Within each day, clock-in rows have an odd row number and clock-out rows have an even one. The % 2 = 0 test picks out the even rows, so the CASE expression computes the time difference only for clock-out rows and yields NULL for all others.
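
To see what the two window functions produce, here is a small sketch using the four times from the question. The date and the inline person_work rows are made up for illustration, and the extra rn and previous_time columns are only there to expose the intermediate values.

```sql
-- Made-up sample day: in 08:00, out 13:00, in 16:36, out 19:20.
with person_work (personnel_id, "date", "time") as (
  values (3, date '2018-01-02', time '08:00'),
         (3, date '2018-01-02', time '13:00'),
         (3, date '2018-01-02', time '16:36'),
         (3, date '2018-01-02', time '19:20')
)
select "time",
       row_number() over w as rn,             -- 1, 2, 3, 4 within the day
       lag("time") over w  as previous_time,  -- NULL for the first row
       case
         when row_number() over w % 2 = 0 then "time" - lag("time") over w
       end as duration                        -- only set on even (clock-out) rows
from person_work
window w as (partition by personnel_id, "date" order by "time")
order by "time";
```

Rows 2 and 4 carry the durations 05:00:00 and 02:44:00; rows 1 and 3 have a NULL duration.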

In the next step we need to aggregate the pairs into durations per month. This can be done by building upon the previous query. I am using a common table expression to re-use the previous query:

with hours as (
  select personnel_id, 
         "date", 
         case 
           when row_number() over w % 2 = 0 then 
              -- this converts the interval into a decimal value
              extract(epoch from "time" - lag("time") over w)/3600
         end as hours
  from person_work
  window w as (partition by personnel_id, "date" order by "time")
), hours_per_month as (
  select personnel_id, 
         extract(year from "date")::int as work_year,
         extract(month from "date")::int as work_month,
         sum(hours) work_hours
  from hours
  where hours is not null
  group by personnel_id, work_year, work_month
)
select *
from hours_per_month;

extract(year from ...) returns the year of the date column as a decimal value (double precision before PostgreSQL 14, numeric since then). ::int is a type cast that converts it to an integer. Strictly speaking, the cast isn't necessary.

extract(epoch from ...) returns the duration of the interval in seconds. Dividing that result by 3600 gives the interval in hours.

This would return something like:

personnel_id | work_year | work_month | work_hours
-------------+-----------+------------+-----------
           1 |      2018 |          1 |      25.33
           1 |      2018 |          2 |      17.08
           1 |      2018 |          3 |       8.25
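
The extract() calls used in that query can also be tried in isolation. This is a minimal sketch with made-up literal values; the times are the second clock-in/clock-out pair from the question.

```sql
select extract(year  from date '2018-03-15')::int               as work_year,
       extract(month from date '2018-03-15')::int               as work_month,
       -- time - time yields an interval; epoch turns it into seconds
       extract(epoch from time '19:20' - time '16:36')          as seconds,
       extract(epoch from time '19:20' - time '16:36') / 3600   as hours;
```

The interval is 2 hours 44 minutes, so seconds is 9840 and hours is roughly 2.73. How many decimal places are displayed depends on the PostgreSQL version.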

Then, in the final step, we need to turn the rows into columns. This can be done with conditional aggregation using the filter clause:

with hours as (
  select personnel_id, 
         "date", 
         case 
           when row_number() over w % 2 = 0 then extract(epoch from "time" - lag("time") over w)/3600
         end as hours
  from person_work
  window w as (partition by personnel_id, "date" order by "time")
), hours_per_month as (
  select personnel_id, 
         extract(year from "date")::int as work_year,
         extract(month from "date")::int as work_month,
         sum(hours) hours
  from hours
  where hours is not null
  group by personnel_id, work_year, work_month
)
select personnel_id, 
       work_year,
       sum(hours) filter (where work_month = 1) as hours_jan,
       sum(hours) filter (where work_month = 2) as hours_feb,
       sum(hours) filter (where work_month = 3) as hours_mar,
       sum(hours) filter (where work_month = 4) as hours_apr,
       sum(hours) filter (where work_month = 5) as hours_may,
       sum(hours) filter (where work_month = 6) as hours_jun,
       sum(hours) filter (where work_month = 7) as hours_jul,
       sum(hours) filter (where work_month = 8) as hours_aug,
       sum(hours) filter (where work_month = 9) as hours_sep,
       sum(hours) filter (where work_month = 10) as hours_oct,
       sum(hours) filter (where work_month = 11) as hours_nov,
       sum(hours) filter (where work_month = 12) as hours_dec
from hours_per_month
group by personnel_id, work_year;

This returns something like this:

personnel_id | work_year | hours_jan | hours_feb | hours_mar | hours_apr | hours_may | hours_jun | hours_jul | hours_aug | hours_sep | hours_oct | hours_nov | hours_dec
-------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------
           1 |      2018 |     25.33 |     17.08 |      8.25 |       ... |       ... |       ... |       ... |       ... |       ... |       ... |       ... |       ...

If you just want a report for a single year, you can add a where work_year = ... condition to the final select and remove the work_year column from both the select list and the group by.
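
A sketch of that single-year variant, shortened to three month columns. The inline hours_per_month rows are made-up stand-ins for the CTE of the same name, and 2018 is just an example year.

```sql
-- Made-up rows standing in for the hours_per_month CTE from the full query.
with hours_per_month (personnel_id, work_year, work_month, hours) as (
  values (1, 2018,  1, 25.33),
         (1, 2018,  2, 17.08),
         (1, 2017, 12,  8.25)
)
select personnel_id,
       sum(hours) filter (where work_month = 1)  as hours_jan,
       sum(hours) filter (where work_month = 2)  as hours_feb,
       sum(hours) filter (where work_month = 12) as hours_dec
from hours_per_month
where work_year = 2018   -- restrict the report to one year
group by personnel_id;   -- work_year no longer needed here
```

With these rows, the 2017 entry is filtered out, so hours_dec is NULL for personnel_id 1.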

Online example: https://rextester.com/OEEAZ64654
