Reputation: 10909
I have a DataFrame with data about train stations over one month, in which three columns form the index: Station, Date, and Hour. It could look like this:
Station Date Hour Passengers
Berlin HBF 2012-12-24 12:00 1000
Berlin HBF 2012-12-24 13:00 2000
Berlin HBF 2012-12-24 14:00 1000
Berlin HBF 2012-12-24 15:00 1000
....
Stuttgart 2012-12-24 12:00 500
Since I am only interested in the monthly sum for each station, I would like to group by Station, Date, and Hour, so that the end result looks like this:
Station Passengers
Berlin HBF 4000
....
Stuttgart 500
But I am unable to get pandas to produce this result. I tried:
byStation = traindata.groupby(['Station', 'Date', 'Hour']).agg(np.sum)
But that simply returns a MultiIndex with all rows...
Upvotes: 2
Views: 1154
Reputation: 9709
Looks like you want to group by "Station" only and sum the "Passengers" column. You do not need a MultiIndex here. Your attempt creates one, but since it is the same as the index of your raw data, each group holds a single row and nothing is actually aggregated.
This one should work:
traindata.groupby("Station").Passengers.sum()
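To see the difference, here is a minimal sketch. The sample frame is an assumption: it holds only the rows visible in the question (the elided rows are left out), with Station, Date, and Hour set as the index. It uses `groupby(level=...)`, which addresses index levels explicitly.

```python
import pandas as pd

# Hypothetical sample data: only the rows shown in the question.
traindata = pd.DataFrame(
    {
        "Station": ["Berlin HBF"] * 4 + ["Stuttgart"],
        "Date": ["2012-12-24"] * 5,
        "Hour": ["12:00", "13:00", "14:00", "15:00", "12:00"],
        "Passengers": [1000, 2000, 1000, 1000, 500],
    }
).set_index(["Station", "Date", "Hour"])

# Grouping by all three index levels: every group is a single row,
# so the result has as many rows as the input.
per_row = traindata.groupby(level=["Station", "Date", "Hour"]).sum()

# Grouping by the Station level only collapses dates and hours.
by_station = traindata.groupby(level="Station")["Passengers"].sum()

# Turn the resulting Series into a two-column table (Station, Passengers).
result = by_station.reset_index()
print(result)
```

If Station is an ordinary column rather than an index level, `traindata.groupby("Station")` as written above does the same job.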
Upvotes: 2