Wendy D.
Wendy D.

Reputation: 109

Iterate over rows in a data frame create a new column then adding more columns based on the new column

I have a data frame as below:

Date           Quantity
2019-04-25     100
2019-04-26     148
2019-04-27     124

The output that I need is to take the quantity difference between two next dates and average over 24 hours and create 23 columns with hourly quantity difference added to the column before such as below:

Date           Quantity Hour-1   Hour-2 ....Hour-23
2019-04-25     100      102      104   .... 146
2019-04-26     148      147      146   .... 123
2019-04-27     124

I'm trying to iterate over a loop but it's not working ,my code is as below:

for i in df.index:
    diff=(df.get_value(i+1,'Quantity')-df.get_value(i,'Quantity'))/24
    for j in range(24):
    df[i,[1+j]]=df.[i,[j]]*(1+diff)

I did some research but I have not found how to create columns like above iteratively. I hope you could help me. Thank you in advance.

Upvotes: 1

Views: 717

Answers (2)

BENY
BENY

Reputation: 323226

IIUC using resample and interpolate, then we pivot the output

s=df.set_index('Date').resample('1 H').interpolate()
s=pd.pivot_table(s,index=s.index.date,columns=s.groupby(s.index.date).cumcount(),values=s,aggfunc='mean')
s.columns=s.columns.droplevel(0)
s
Out[93]: 
               0      1      2      3   ...       20     21     22     23
2019-04-25  100.0  102.0  104.0  106.0  ...    140.0  142.0  144.0  146.0
2019-04-26  148.0  147.0  146.0  145.0  ...    128.0  127.0  126.0  125.0
2019-04-27  124.0    NaN    NaN    NaN  ...      NaN    NaN    NaN    NaN
[3 rows x 24 columns]

Upvotes: 2

Sid
Sid

Reputation: 4055

If I have understood the question correctly.

for loop approach:

list_of_values = []
for i,row in df.iterrows():
    if i < len(df) - 2:
         qty = row['Quantity']
         qty_2 = df.at[i+1,'Quantity']
         diff = (qty_2 - qty)/24
         list_of_values.append(diff)
   else:
         list_of_values.append(0)

df['diff'] = list_of_values

Output:

Date           Quantity   diff
2019-04-25     100        2
2019-04-26     148        -1
2019-04-27     124        0

Now create the columns required.

i.e.

df['Hour-1'] = df['Quantity'] + df['diff']
df['Hour-2'] = df['Quantity'] + 2*df['diff']
.
.
.
.

There are other approaches which will work way better.

Upvotes: 0

Related Questions