Reputation: 133

Transposing/reshaping data in Python

I have a dataset in this form:

Agent ID    Month   values
101         Jan-17  2
101         Feb-17  4
101         Mar-17  3
101         Apr-17  8
101         May-17  12
101         Jun-17  3
101         Dec-17  1
102         Jan-17  2
102         Feb-17  3
102         Mar-17  7
102         Apr-17  3
102         May-17  2
102         Jun-17  11
102         Sep-17  2
102         Oct-17  2
102         Nov-17  1
102         Dec-17  4

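I want to reshape it like this, so that every row also carries all of that agent's monthly values as columns (0 where the agent has no entry for a month):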

Agent ID  Month   values  Jan-17  Feb-17  Mar-17  Apr-17  May-17  Jun-17  Sep-17  Oct-17  Nov-17  Dec-17
101       Jan-17  2       2       4       3       8       12      3       0       0       0       1
101       Feb-17  4       2       4       3       8       12      3       0       0       0       1
101       Mar-17  3       2       4       3       8       12      3       0       0       0       1
101       Apr-17  8       2       4       3       8       12      3       0       0       0       1
101       May-17  12      2       4       3       8       12      3       0       0       0       1
101       Jun-17  3       2       4       3       8       12      3       0       0       0       1
101       Dec-17  1       2       4       3       8       12      3       0       0       0       1
102       Jan-17  2       2       3       7       3       2       11      2       2       1       4
102       Feb-17  3       2       3       7       3       2       11      2       2       1       4
102       Mar-17  7       2       3       7       3       2       11      2       2       1       4
102       Apr-17  3       2       3       7       3       2       11      2       2       1       4
102       May-17  2       2       3       7       3       2       11      2       2       1       4
102       Jun-17  11      2       3       7       3       2       11      2       2       1       4
102       Sep-17  2       2       3       7       3       2       11      2       2       1       4
102       Oct-17  2       2       3       7       3       2       11      2       2       1       4
102       Nov-17  1       2       3       7       3       2       11      2       2       1       4
102       Dec-17  4       2       3       7       3       2       11      2       2       1       4

Upvotes: 2

Answers (2)

Andy L.

Reputation: 25239

This is also doable with pd.crosstab, followed by a groupby that applies ffill and bfill within each agent.
I used the line from WenYoBen's answer to convert df.Month to a datetime and then to a sortable year-month string, so the columns keep the order the OP wants:

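df.Month = pd.to_datetime(df.Month, format='%b-%y').dt.strftime('%Y-%m')
# one row per (AgentID, Month, values); each row has a value only in its own month column
df1 = pd.crosstab(index=[df.AgentID, df.Month, df['values']], columns=df.Month, values=df['values'], aggfunc='first')
# fill within each agent; group_keys=False keeps the index unchanged so reset_index works on newer pandas
df1 = df1.groupby(level=0, group_keys=False).apply(lambda x: x.ffill().bfill()).fillna(0).reset_index()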


Out[2103]:
Month  AgentID    Month  values  2017-01  2017-02  2017-03  2017-04  2017-05  \
0          101  2017-01       2      2.0      4.0      3.0      8.0     12.0
1          101  2017-02       4      2.0      4.0      3.0      8.0     12.0
2          101  2017-03       3      2.0      4.0      3.0      8.0     12.0
3          101  2017-04       8      2.0      4.0      3.0      8.0     12.0
4          101  2017-05      12      2.0      4.0      3.0      8.0     12.0
5          101  2017-06       3      2.0      4.0      3.0      8.0     12.0
6          101  2017-12       1      2.0      4.0      3.0      8.0     12.0
7          102  2017-01       2      2.0      3.0      7.0      3.0      2.0
8          102  2017-02       3      2.0      3.0      7.0      3.0      2.0
9          102  2017-03       7      2.0      3.0      7.0      3.0      2.0
10         102  2017-04       3      2.0      3.0      7.0      3.0      2.0
11         102  2017-05       2      2.0      3.0      7.0      3.0      2.0
12         102  2017-06      11      2.0      3.0      7.0      3.0      2.0
13         102  2017-09       2      2.0      3.0      7.0      3.0      2.0
14         102  2017-10       2      2.0      3.0      7.0      3.0      2.0
15         102  2017-11       1      2.0      3.0      7.0      3.0      2.0
16         102  2017-12       4      2.0      3.0      7.0      3.0      2.0

Month  2017-06  2017-09  2017-10  2017-11  2017-12
0          3.0      0.0      0.0      0.0      1.0
1          3.0      0.0      0.0      0.0      1.0
2          3.0      0.0      0.0      0.0      1.0
3          3.0      0.0      0.0      0.0      1.0
4          3.0      0.0      0.0      0.0      1.0
5          3.0      0.0      0.0      0.0      1.0
6          3.0      0.0      0.0      0.0      1.0
7         11.0      2.0      2.0      1.0      4.0
8         11.0      2.0      2.0      1.0      4.0
9         11.0      2.0      2.0      1.0      4.0
10        11.0      2.0      2.0      1.0      4.0
11        11.0      2.0      2.0      1.0      4.0
12        11.0      2.0      2.0      1.0      4.0
13        11.0      2.0      2.0      1.0      4.0
14        11.0      2.0      2.0      1.0      4.0
15        11.0      2.0      2.0      1.0      4.0
16        11.0      2.0      2.0      1.0      4.0
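
To see why the ffill/bfill step is needed, here is a minimal self-contained sketch. It rebuilds the sample data from the question (the column name AgentID and the variable names ct, cells_per_row and filled are my own assumptions for illustration). Right after the crosstab, each row has exactly one non-NaN cell, the one for its own month, so the values have to be spread across the other rows of the same agent.

```python
import pandas as pd

# Sample data rebuilt from the question; the column name AgentID is assumed.
df = pd.DataFrame({
    "AgentID": [101] * 7 + [102] * 10,
    "Month": ["Jan-17", "Feb-17", "Mar-17", "Apr-17", "May-17", "Jun-17", "Dec-17",
              "Jan-17", "Feb-17", "Mar-17", "Apr-17", "May-17", "Jun-17",
              "Sep-17", "Oct-17", "Nov-17", "Dec-17"],
    "values": [2, 4, 3, 8, 12, 3, 1, 2, 3, 7, 3, 2, 11, 2, 2, 1, 4],
})
df["Month"] = pd.to_datetime(df["Month"], format="%b-%y").dt.strftime("%Y-%m")

ct = pd.crosstab(index=[df["AgentID"], df["Month"], df["values"]],
                 columns=df["Month"], values=df["values"], aggfunc="first")

# Count the non-NaN cells per row: only the row's own month is populated.
cells_per_row = ct.notna().sum(axis=1)

# Spread each agent's values over all of that agent's rows, then use 0
# for months the agent never had. group_keys=False leaves the index as is.
filled = (
    ct.groupby(level=0, group_keys=False)
      .apply(lambda g: g.ffill().bfill())
      .fillna(0)
)
```

Calling filled.reset_index() then turns the three index levels back into ordinary columns, as in the output above.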

Upvotes: 0

BENY

Reputation: 323236

I think the way to do this is to pivot first, then merge the result back onto the original frame:

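df.Month = pd.to_datetime(df.Month, format='%b-%y').dt.strftime('%Y-%m')
# keyword arguments: newer pandas versions no longer accept positional arguments to pivot
s = df.pivot(index='AgentID', columns='Month', values='values').fillna(0).reset_index()
# AgentID is the only shared column, so each original row gets its agent's wide row
df = df.merge(s, on='AgentID')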
df
Out[876]: 
    AgentID    Month  values   ...     2017-10  2017-11  2017-12
0       101  2017-01       2   ...         0.0      0.0      1.0
1       101  2017-02       4   ...         0.0      0.0      1.0
2       101  2017-03       3   ...         0.0      0.0      1.0
3       101  2017-04       8   ...         0.0      0.0      1.0
4       101  2017-05      12   ...         0.0      0.0      1.0
5       101  2017-06       3   ...         0.0      0.0      1.0
6       101  2017-12       1   ...         0.0      0.0      1.0
7       102  2017-01       2   ...         2.0      1.0      4.0
8       102  2017-02       3   ...         2.0      1.0      4.0
9       102  2017-03       7   ...         2.0      1.0      4.0
10      102  2017-04       3   ...         2.0      1.0      4.0
11      102  2017-05       2   ...         2.0      1.0      4.0
12      102  2017-06      11   ...         2.0      1.0      4.0
13      102  2017-09       2   ...         2.0      1.0      4.0
14      102  2017-10       2   ...         2.0      1.0      4.0
15      102  2017-11       1   ...         2.0      1.0      4.0
16      102  2017-12       4   ...         2.0      1.0      4.0
[17 rows x 13 columns]

More info: the intermediate pivoted frame s

s
Out[878]: 
Month  AgentID  2017-01  2017-02   ...     2017-10  2017-11  2017-12
0          101      2.0      4.0   ...         0.0      0.0      1.0
1          102      2.0      3.0   ...         2.0      1.0      4.0
[2 rows x 11 columns]
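
For anyone who wants to run the pivot-then-merge idea end to end, here is a minimal sketch with the sample data from the question. The column name AgentID, the variable names wide and result, and the cast back to int are my own assumptions, not part of the original snippet.

```python
import pandas as pd

# Sample data rebuilt from the question; the column name AgentID is assumed.
df = pd.DataFrame({
    "AgentID": [101] * 7 + [102] * 10,
    "Month": ["Jan-17", "Feb-17", "Mar-17", "Apr-17", "May-17", "Jun-17", "Dec-17",
              "Jan-17", "Feb-17", "Mar-17", "Apr-17", "May-17", "Jun-17",
              "Sep-17", "Oct-17", "Nov-17", "Dec-17"],
    "values": [2, 4, 3, 8, 12, 3, 1, 2, 3, 7, 3, 2, 11, 2, 2, 1, 4],
})

# Year-month strings sort chronologically, so the pivoted columns come out in order.
df["Month"] = pd.to_datetime(df["Month"], format="%b-%y").dt.strftime("%Y-%m")

# One row per agent, one column per month; missing months become 0.
wide = (
    df.pivot(index="AgentID", columns="Month", values="values")
      .fillna(0)
      .astype(int)
      .reset_index()
)
wide.columns.name = None  # drop the leftover "Month" label on the column axis

# Attach the agent's wide row to every original row of that agent.
result = df.merge(wide, on="AgentID", how="left")
```

The left merge keeps the original row order, and result ends up with the 3 original columns plus one column per month.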

Upvotes: 5
