Jessica Yoskovich
Jessica Yoskovich

Reputation: 41

Calculating average by using the previous row's value and following row's value

I have calculated average values for each month. Some months are NULL and my manager wants me to use the previous row's value and following month's value and fill the months which are having NULL values.

Current result (see below pic):

Expected Result

DECLARE @DATE DATE = '2017-01-01';
WITH DATEDIM AS
(             
  SELECT DISTINCT DTM.FirstDayOfMonth 
  FROM DATEDIM DTM 
  WHERE Date >= '01/01/2017'
  AND Date <= DATEADD(mm,-1,Getdate())
), 
Tab1 AS
(
  SELECT 
    T1.FirstDayOfMonth AS MONTH_START,
    AVG1, 
    ROW_NUMBER() OVER (
      ORDER BY DATEADD(MM,DATEDIFF(MM, 0, T1.FirstDayOfMonth),0) DESC
    ) AS RNK 
  FROM DATEDIM T1 
  LEFT OUTER JOIN (
    SELECT 
      DATEADD(MM,DATEDIFF(MM, 0, StartDate),0) MONTH_START, 
      AVG(CAST(DATEDIFF(dd, StartDate, EndDate) AS FLOAT)) AS AVG1
    FROM DATATable
    WHERE EndDate >= StartDate
    AND StartDate >= @DATE
    AND EndDate >= @DATE
    GROUP BY DATEADD(MM,DATEDIFF(MM, 0, StartDate),0)
  ) T2 ON T1.FirstDayOfMonth = T2.MONTH_START
)
SELECT * 
FROM Tab1

Upvotes: 1

Views: 1147

Answers (4)

Serg
Serg

Reputation: 22811

Using your CTEs

select MONTH_START,
    case when AVG1 is null then
       (select top(1) t2.AVG1 
        from Tab1 t2 
        where t1.RNK > t2.RNK and t2.AVG1 is not null
        order by t2.RNK desc)
    else AVG1 end AVG1,
    RNK 
from Tab1 t1

Edit

Version for an average of nearest peceding and nearest following non-nulls. Both must exist otherwise NULL is returned.

select MONTH_START,
    case when AVG1 is null then
     ( (select top(1) t2.AVG1 
        from Tab1 t2 
        where t1.RNK > t2.RNK and t2.AVG1 is not null
        order by t2.RNK desc)
       +(select top(1) t2.AVG1 
        from Tab1 t2 
        where t1.RNK < t2.RNK and t2.AVG1 is not null
        order by t2.RNK)
      ) / 2
    else AVG1 end AVG1,
    RNK 
from Tab1 t1

Upvotes: 1

Michał Turczyn
Michał Turczyn

Reputation: 37367

You could try this query (I just reflected in my sample data relevant parts, I omitted date column):

declare @tbl table (rank int, value int);
insert into @tbl values
(1, null),
(2, 20),
(3, 30),
(4, null),
(5, null),
(6, null),
(7, 40),
(8, null),
(9, null),
(10, 36),
(11, 22);

;with cte as (
    select *,
           DENSE_RANK() over (order by case when value is null then rank else value end) drank,
           case when value is null then lag(value) over (order by rank) end lag,
           case when value is null then lead(value) over (order by rank) end lead
    from @tbl
)

select rank, value, case when value is null then  
         max(lag) over (partition by grp) / 2 +
         max(lead) over (partition by grp) / 2
       else value end valueWithAvg
from (
    select *,
           rank - drank grp from cte
) a order by rank

Upvotes: 0

ravioli
ravioli

Reputation: 3833

You can manipulate previous and following row values using window functions:

SELECT MAX(row_value) OVER(
  ORDER BY ... ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS Previous_Value,
MAX(row_value) OVER(
  ORDER BY ... ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS Next_Value

Alternatively you can use LAG/LEAD functions and modify your sub-query where you get the AVG:

SELECT 
  src.MONTH_START, 
  CASE 
    WHEN src.prev_val IS NULL OR src.next_val IS NULL 
      THEN COALESCE(src.prev_val, src.next_val) -- Return non-NULL value (if exists)
    ELSE (src.prev_val + src.next_val ) / 2
  END AS AVG_new
FROM (
  SELECT 
    DATEADD(MM,DATEDIFF(MM, 0, StartDate),0) MONTH_START, 
    LEAD(CAST(DATEDIFF(dd, StartDate, EndDate) AS FLOAT)) OVER(ORDER BY ...) AS prev_val,
    LAG(CAST(DATEDIFF(dd, StartDate, EndDate) AS FLOAT)) OVER(ORDER BY ...) AS next_val
  -- AVG(CAST(DATEDIFF(dd, StartDate, EndDate) AS FLOAT)) AS AVG1
  FROM DATATable
  WHERE EndDate >= StartDate
  AND StartDate >= @DATE
  AND EndDate >= @DATE
  GROUP BY DATEADD(MM,DATEDIFF(MM, 0, StartDate),0)
) AS src

I haven't tested it, but give it a shot and see how it works. You may need to put at least one column in the ORDER BY portion of the window function.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269803

I can't quite tell what you are trying to calculate the average of, but this is quite simple with window functions:

select t.*,
       avg(val) over (order by month_start rows between 1 preceding and 1 rollowing)
from t;

In your case, I think this translates as:

select datefromparts(year(startdate), month(startdate), 1) as float,
       avg(val) as monthaverage,
       avg(avg(val)) over (order by min(startdate) rows between 1 preceding and 1 following)
from datatable d
where . . .
group by datefromparts(year(startdate), month(startdate), 1)

Upvotes: 0

Related Questions