Jeroen
Jeroen

Reputation: 841

Pandas join grouped dataframe

I have a Pandas dataframe that tells me how many trrips a certain ship has sailed on a given day:

   trip  ship       Date
0     1     1 2020-03-09
1     2     1 2020-03-09
2     3     1 2020-03-09
3     4     1 2020-03-09
4     5     2 2020-03-13
5     6     2 2020-03-13
6     7     2 2020-03-13
7     8     3 2020-03-22
8     9     3 2020-03-22

L= ['year', 'month','day']
df = pd.DataFrame({'year': [2020, 2020,2020,2020,2020,2020, 2020,2020,2020],
                   'month': [3, 3,3,3,3, 3, 3, 3,3 ],
                   'day': [9, 9,9,9,13,13,13,22,22],
                   'trip': [1,2,3,4,5,6,7,8,9],
                   'ship':[1,1,1,1,2,2,2,3,3]})
df['Date'] = pd.to_datetime(df[L],format='%Y%m%d')
df.drop(L,axis=1,inplace=True)

I then count the number of trips within each group 'ship' as:

df2 = df.groupby(['ship'])['trip'].count().reset_index() #counte the number  of trips within each group ship.

How do I assign from the old dataframe df1 the date within that group 'ship' to the new df2?

Desired result look like:

   ship  trip Date 
0     1     4 2020-03-09
1     2     3 2020-03-13
2     3     2 2020-03-22

Upvotes: 1

Views: 49

Answers (2)

Shahir Ansari
Shahir Ansari

Reputation: 1848

You can group by ship,date in case the ships are travelling on different date and you want trips each day.

df2 = df.groupby(['ship','Date']).agg({'trip':'count'})

ship    Date    trip
1   2020-03-09  4
2   2020-03-13  3
3   2020-03-22  2

If you want trips count from the first sale day

df.groupby(['ship']).agg(FirstDate = ('Date','first'),trips = ('trip','count'))

ship FirstDate  trips       
1   2020-03-09  4
2   2020-03-13  3
3   2020-03-22  2

Upvotes: 1

Jaroslav Bezděk
Jaroslav Bezděk

Reputation: 7625

To me, it seems that every ship has got only one Date value. In that case, you can add Date column to .groupby() method, like this:

>>> df2 = df.groupby(['ship', 'Date'])['trip'].count().reset_index()
>>> print(df2)
   ship       Date  trip
0     1 2020-03-09     4
1     2 2020-03-13     3
2     3 2020-03-22     2

Upvotes: 1

Related Questions