FlyingPickle

Reputation: 1133

Python: create a DataFrame using the previous n rows

I have a pandas df as follows:

Value1    Value2    Label
15.1      12         0
17        5          1
19        2          1

I am looking to build a new df in which each row also contains the inputs of the previous n rows. For example, if n=2, my output should be:

Value1.1  Value2.1    Value1.2    Value2.2    Value1    Value2   Label
15.1       12          17           5          19        2         1

This is the third row, which has Label=1; the Value1 and Value2 of the previous 2 rows are prepended to it. Any thoughts on how I can achieve this in Python? Thanks!

Upvotes: 0

Views: 52

Answers (2)

Celius Stingher

Reputation: 18367

Maybe this helps you get the job done. Here the variable n is the number of previous rows you want to use:

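import pandas as pd

n = 2  # number of previous rows to prepend
df = pd.DataFrame({'Value1':[15.1,17,19],
                   'Value2':[12,5,2],
                   'Label':[0,1,1]})
feats = list(df)[:-1]  # every column except the trailing Label
# Lagged names, oldest row first: Value1.1, Value2.1, Value1.2, Value2.2
cols = [f"{c}.{k}" for k in range(1, n+1) for c in feats]
# Flatten the first n rows (without Label), then append the full row at position n
df_out = pd.DataFrame({'vals':df.iloc[:n,:-1].values.flatten().tolist()+df.iloc[n,:].values.tolist()},
                       index = cols + list(df)).T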

The output is as expected:

      Value1.1  Value2.1  Value1.2  Value2.2  Value1  Value2  Label
vals      15.1      12.0      17.0       5.0    19.0     2.0    1.0
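
Note that the snippet above builds only the single row at position n. If you want one output row for every input row that has n predecessors, here is a minimal sketch of the same flattening idea applied in a loop. The helper name window_rows and its label parameter are hypothetical, and it assumes every column other than Label is a feature:

```python
import pandas as pd


def window_rows(df, n, label="Label"):
    """Hypothetical helper: one output row per input row that has n predecessors."""
    feats = [c for c in df.columns if c != label]
    # Names run oldest-first, as in the question: Value1.1, Value2.1, Value1.2, ...
    cols = [f"{c}.{k}" for k in range(1, n + 1) for c in feats] + feats + [label]
    rows = []
    for i in range(n, len(df)):
        # Features of the n rows before position i, flattened row by row
        previous = df[feats].iloc[i - n:i].to_numpy().ravel().tolist()
        # Features and label of the row at position i itself
        current = df[feats + [label]].iloc[i].tolist()
        rows.append(previous + current)
    return pd.DataFrame(rows, columns=cols, index=df.index[n:])


df = pd.DataFrame({'Value1': [15.1, 17, 19],
                   'Value2': [12, 5, 2],
                   'Label': [0, 1, 1]})
out = window_rows(df, n=2)
print(out)
```

With the three-row df from the question and n=2 this yields a single row (index 2); a longer frame would give one row per index from n onward. The Python loop is fine for small frames but will be slow on large ones.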

Upvotes: 1

Pierre D

Reputation: 26221

Perhaps something like:

n = 2
sel = [k for k in df.columns if k != 'Label']
df2 = df
for k in range(1, n + 1):
    df2 = df2.join(df[sel].shift(k), rsuffix=f'.{k}')

print(df2)
   Value1  Value2  Label  Value1.1  Value2.1  Value1.2  Value2.2
0    15.1      12      0       NaN       NaN       NaN       NaN
1    17.0       5      1      15.1      12.0       NaN       NaN
2    19.0       2      1      17.0       5.0      15.1      12.0

Or, if you prefer the column order you indicated in your example:

df2 = df
for k in range(1, n+1):
    df2 = df[sel].shift(k).join(df2, lsuffix=f'.{k}')

print(df2)
   Value1.2  Value2.2  Value1.1  Value2.1  Value1  Value2  Label
0       NaN       NaN       NaN       NaN    15.1      12      0
1       NaN       NaN      15.1      12.0    17.0       5      1
2      15.1      12.0      17.0       5.0    19.0       2      1
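
In both outputs above the suffix .k means "k rows back", so .2 is the older row, whereas the question numbers the blocks oldest-first (.1 is the oldest). The first n rows also contain NaN because they lack a full history. A possible sketch that wraps the shift idea in a function, renames to the question's numbering, and keeps only complete rows follows; the name add_lags and the label parameter are hypothetical, and pd.concat is used here in place of join:

```python
import pandas as pd


def add_lags(df, n, label="Label"):
    """Hypothetical helper: prepend the n previous rows' features to each row."""
    sel = [c for c in df.columns if c != label]
    parts = []
    for k in range(n, 0, -1):  # largest lag (oldest row) first
        lagged = df[sel].shift(k)
        # Number the blocks oldest-first to match the question: lag n -> ".1"
        lagged.columns = [f"{c}.{n - k + 1}" for c in sel]
        parts.append(lagged)
    out = pd.concat(parts + [df], axis=1)
    # Skip the first n rows by position, since they have no full history.
    # (dropna() would also discard rows whose original data contain NaN.)
    return out.iloc[n:]


df = pd.DataFrame({'Value1': [15.1, 17, 19],
                   'Value2': [12, 5, 2],
                   'Label': [0, 1, 1]})
result = add_lags(df, n=2)
print(result)
```

Because shift is vectorized, this should scale better than a row-by-row loop, and the original dtypes of Value2 and Label are preserved in the rows that remain.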

Upvotes: 1
