Hristo Stoychev
Hristo Stoychev

Reputation: 535

Pandas, create new columns based on existing with repeated count

It's a bit complicated for explain, so I'll do my best. I have a pandas with two columns: hour (from 1 to 24) and value(corresponding to each hour). Dataset index is huge but column hour is repeated on that 24 hours basis (from 1 to 24). I am trying to create new 24 columns: value -1, value -2, value -3...value -24 that will correspond to each row and value from -1 hour, value from -2 hours(from above rows).

hour | value | value -1 | value -2 | value -3| ... | value - 24
1       10       0          0           0               0
2       11       10         0           0               0
3       12       11         10          0               0
4       13       12         11          10              0
...
24      32       31         30          29              0
1       33       32         31          30              10
2       34       33         32          31              11
and so on...

All value numbers are for the example. As I said there are lots of rows, not only 24 for all hours in a day time but all following time series from 1 to 24 and etc. Thanks in advance and may the force be with you!

Upvotes: 1

Views: 479

Answers (2)

Deena
Deena

Reputation: 6213

Is this what you are looking for?

import pandas as pd
df = pd.DataFrame({'hour': list(range(24))*2, 
                   'value': list(range(48))})

shift_cols_n = 10 
for shift in range(1, shift_cols_n): 
    new_columns_name = 'value - ' + str(shift)

    # Assuming that you don't have any NAs in your dataframe
    df[new_columns_name] = df['value'].shift(shift).fillna(0)

    # A safer (and a less simple) way, in case you have NAs in your dataframe
    df[new_columns_name] = df['value'].shift(shift)
    df.loc[:shift, new_columns_name] = 0

print(df.head(9))

   hour  value  value - 1  value - 2  value - 3  value - 4  value - 5  \
0     0      0        0.0        0.0        0.0        0.0        0.0   
1     1      1        0.0        0.0        0.0        0.0        0.0   
2     2      2        1.0        0.0        0.0        0.0        0.0   
3     3      3        2.0        1.0        0.0        0.0        0.0   
4     4      4        3.0        2.0        1.0        0.0        0.0   
5     5      5        4.0        3.0        2.0        1.0        0.0   
6     6      6        5.0        4.0        3.0        2.0        1.0   
7     7      7        6.0        5.0        4.0        3.0        2.0   
8     8      8        7.0        6.0        5.0        4.0        3.0   

   value - 6  value - 7  value - 8  value - 9  
0        0.0        0.0        0.0        0.0  
1        0.0        0.0        0.0        0.0  
2        0.0        0.0        0.0        0.0  
3        0.0        0.0        0.0        0.0  
4        0.0        0.0        0.0        0.0  
5        0.0        0.0        0.0        0.0  
6        0.0        0.0        0.0        0.0  
7        1.0        0.0        0.0        0.0  
8        2.0        1.0        0.0        0.0  

Upvotes: 0

Kacper Wolkowski
Kacper Wolkowski

Reputation: 1597

Is this what you need?

df = pd.DataFrame([[1,10],[2,11],
             [3,12],[4,13]], columns=['hour','value'])

for i in range(1, 24):
    df['value -' + str(i)] = df['value'].shift(i).fillna(0)

result:

enter image description here

Upvotes: 3

Related Questions