Reputation: 45
I have the following dataset...
https://drive.google.com/open?id=1PpI6PlO8ox2vKfM4aGmEUexCPPWa59S_
How do I create a new DataFrame where the first column is the unique neighbourhood names and the next two columns are filled with the minimum and maximum price for each neighbourhood?
Upvotes: 0
Views: 1325
Reputation: 19733
you can group by the hosts and get max and min value like below:
>>> data = pd.read_csv('dumbbell_data.csv')
>>> pd.DataFrame([[value.maxPrice.max(), value.minPrice.min(), key] for key, value in data.groupby('neighbourhood_hosts')], columns=data.columns)
maxPrice minPrice neighbourhood_hosts
0 45 45 Bloomsbury
1 247 117 Canonbury
2 112 72 Chelsea
3 160 29 LB of Barnet
4 80 72 LB of Hackney
5 321 58 Shepherd's Bush
6 72 55 Shoreditch
7 115 72 Wapping
8 120 58 Wimbledon
A more Pythonic way is to use agg
function:
>>> data.sort_values(by='neighbourhood_hosts').groupby('neighbourhood_hosts').agg({'maxPrice': max, 'minPrice':min})
maxPrice minPrice
neighbourhood_hosts
Bloomsbury 45 45
Canonbury 247 117
Chelsea 112 72
LB of Barnet 160 29
LB of Hackney 80 72
Shepherd's Bush 321 58
Shoreditch 72 55
Wapping 115 72
Wimbledon 120 58
Upvotes: 1