Shift values in pandas dataframe vertically

Question

I have a pandas dataframe that looks like this:

import pandas as pd
d = {'value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20], 'user': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C',]}
df = pd.DataFrame(data=d)
df

    value   user
    1       A
    2       A
    3       A
    4       A
    5       A
    6       A
    7       A
    8       A
    9       B
    10      B
    11      B
    12      B
    13      C
    14      C
    15      C
    16      C
    17      C
    18      C
    19      C
    20      C

I would like to add a new column called value2 to this dataframe and populate it with the values from the value column, shifted upwards by one row. The last row of each user group should be removed too. In the example below you can see that, e.g., the value pair 8-9 is missing.

    value  value2      user
    1      2           A
    2      3           A
    3      4           A
    4      5           A
    5      6           A
    6      7           A
    7      8           A
    9      10          B
    10     11          B
    11     12          B
    13     14          C
    14     15          C
    15     16          C
    16     17          C
    17     18          C
    18     19          C
    19     20          C

I know I can shift the rows with pandas' shift() function, but that doesn't give me the desired output, as you can see below. The solution seems to be to remove the last row of each user group and to change the order of the columns (the user column should be the third column). Any ideas how to improve this?

df['value2'] = df['value'].shift(-1, fill_value=0)
df


    value   user    value2
    1       A       2
    2       A       3
    3       A       4
    4       A       5
    5       A       6
    6       A       7
    7       A       8
    8       A       9
    9       B       10
    10      B       11
    11      B       12
    12      B       13
    13      C       14
    14      C       15
    15      C       16
    16      C       17
    17      C       18
    18      C       19
    19      C       20
    20      C       0

jezrael · Accepted Answer

Use DataFrameGroupBy.shift with fill_value=-1, then remove the rows where the new column is -1:

df['value2'] = df.groupby('user')['value'].shift(-1, fill_value=-1)

df = df[df['value2'].ne(-1)]
print(df)
    value user  value2
0       1    A       2
1       2    A       3
2       3    A       4
3       4    A       5
4       5    A       6
5       6    A       7
6       7    A       8
8       9    B      10
9      10    B      11
10     11    B      12
12     13    C      14
13     14    C      15
14     15    C      16
15     16    C      17
16     17    C      18
17     18    C      19
18     19    C      20

Or remove the last row of each user with Series.duplicated, which with keep='last' flags every row except the last occurrence of each user:

df['value2'] = df.groupby('user')['value'].shift(-1, fill_value=-1)

df = df[df['user'].duplicated(keep='last')]
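
The -1 sentinel above only works if -1 never appears as a real value. Below is a minimal sketch of a sentinel-free variant, assuming the question's value and user columns: it leaves the shifted gaps as NaN, drops those rows, casts back to integers, and moves user to the third position as the question asks. The name out is only illustrative.

```python
import pandas as pd

# Same data as in the question, built more compactly.
df = pd.DataFrame({
    'value': list(range(1, 21)),
    'user': ['A'] * 8 + ['B'] * 4 + ['C'] * 8,
})

# Shift within each user group; the last row of every group becomes NaN,
# which also turns the column into floats.
df['value2'] = df.groupby('user')['value'].shift(-1)

# Drop the NaN rows, restore the integer dtype, and put user last.
out = (
    df.dropna(subset=['value2'])
      .astype({'value2': 'int64'})
      [['value', 'value2', 'user']]
)
print(out)
```

The original index labels are kept, so the removed rows show up as gaps; chaining .reset_index(drop=True) would renumber them if that matters.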
