Reputation: 1323
I have a dataframe. Aggregating the datframe using the pandas groupby:
df.groupby(["Date", "company"]).producttyp.size().reset_index()
Results are displayed in the table below
Date company producttyp
0 Apr-2020 proA1 15
1 Apr-2020 proB1 6
2 Apr-2020 probCC 12
3 Apr-2020 prouba 18
4 Feb-2020 pWC 4
5 Feb-2020 pCA 4
6 Feb-2020 prosu 3
7 Jan-2020 proku 11
8 Jan-2020 proyu 5
9 Jan-2020 prose 2
10 Mar-2020 proud 10
11 Mar-2020 pride 7
12 Mar-2020 proza 6
13 May-2020 proVVaG 40
14 May-2020 proAG 13
15 May-2020 proG 41
16 May-2020 PKWC 18
I will like the DataFrame
to be sorted according to the Date
column. Is there any work around this?
Upvotes: 0
Views: 144
Reputation: 71687
Use, pd.to_datetime
to convert the Date
column to the pandas datetime series and assign this to a temporary column in a grouped df, then use DataFrame.sort_values
to sort this dataframe on this temporary column, finally using DataFrame.drop
drop this temporary column:
df = (
df.assign(temp=pd.to_datetime(df['Date'], format='%b-%Y'))
.sort_values(by='temp', ignore_index=True)
.drop('temp', 1)
)
Result:
# print(df)
Date company producttyp
0 Jan-2020 proyu 5
1 Jan-2020 proku 11
2 Jan-2020 prose 2
3 Feb-2020 pWC 4
4 Feb-2020 pCA 4
5 Feb-2020 prosu 3
6 Mar-2020 proza 6
7 Mar-2020 pride 7
8 Mar-2020 proud 10
9 Apr-2020 proA1 15
10 Apr-2020 prouba 18
11 Apr-2020 probCC 12
12 Apr-2020 proB1 6
13 May-2020 proVVaG 40
14 May-2020 proAG 13
15 May-2020 proG 41
16 May-2020 PKWC 18
Upvotes: 2
Reputation: 7604
Here's another way using pd.to_datetime
to convert it to 01-01-2020
then sort it using df.sort_values
and convert it back to Jan-2020
format using strftime
:
df['Date'] = pd.to_datetime(df['Date'])
df.sort_values(by=['Date'], inplace=True)
df['Date'] = df['Date'].dt.strftime('%b-%Y')
df
Date company producttyp
8 Jan-2020 proyu 5
7 Jan-2020 proku 11
9 Jan-2020 prose 2
4 Feb-2020 pWC 4
5 Feb-2020 pCA 4
6 Feb-2020 prosu 3
12 Mar-2020 proza 6
11 Mar-2020 pride 7
10 Mar-2020 proud 10
0 Apr-2020 proA1 15
3 Apr-2020 prouba 18
2 Apr-2020 probCC 12
1 Apr-2020 proB1 6
13 May-2020 proVVaG 40
14 May-2020 proAG 13
15 May-2020 proG 41
16 May-2020 PKWC 18
Upvotes: 1
Reputation: 7529
Assuming that your Date
column is made of actual datetime
objects, you can simply do:
df.sort_values(by='Date')
output:
Date company producttyp
8 2020-01-01 proyu 5
7 2020-01-01 proku 11
9 2020-01-01 prose 2
4 2020-02-01 pWC 4
5 2020-02-01 pCA 4
6 2020-02-01 prosu 3
12 2020-03-01 proza 6
11 2020-03-01 pride 7
10 2020-03-01 proud 10
0 2020-04-01 proA1 15
3 2020-04-01 prouba 18
2 2020-04-01 probCC 12
1 2020-04-01 proB1 6
13 2020-05-01 proVVaG 40
14 2020-05-01 proAG 13
15 2020-05-01 proG 41
16 2020-05-01 PKWC 18
You may want to call reset_index(drop=True)
afterwards if you don't need to keep the old, unsorted index.
If your Date
column isn't made of datetime
objects, you can convert it with:
df.Date = pd.to_datetime(df.Date)
Upvotes: 1