Reputation: 23
The script scrapes prices, addresses, suburbs and postcodes of houses and then writes them to a CSV file.
The CSV file is imported into pandas (only the postcode and price columns) and grouped by postcode to get the mean price per postcode. The grouped result is written to a JSON file.
The CSV file looks like this in Excel:
________________________
|Postcode|Price |
________________________
|5061 | 205000 |
________________________
|5063 | 930000 |
________________________
The code looks like this:
import pandas as pd
from pandas import DataFrame
df = pd.read_csv('House_Prices.csv', usecols=[ 'Postcode' , ' Price' ], index_col=False)
grouped = df.groupby(['Postcode']).mean()
grouped.to_json('average_house_price.json')
The code above writes the JSON file as:
{" Price":{"5061":2025000.0,"5063":930000.0}}
I want the JSON file to look like this instead:
{"5061":2025000.0,"5063":930000.0}
Is there a way with the pandas library (or another one) to remove the outer " Price" key?
Upvotes: 2
Views: 42
Reputation:
Try passing header=None
Look here:
df = pd.read_csv('t.csv', usecols=[ 'Postcode' , ' Price' ], index_col=False, header=None)
Upvotes: 0
Reputation: 409
Why not select the column before saving as JSON, like this:
grouped[" Price"].to_json('average_house_price.json')
Upvotes: 0
Reputation: 863166
Select the column to aggregate, "Price" or " Price", after the groupby so that the result is a Series:
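# Selecting a single column before .mean() returns a Series, not a DataFrame
grouped = df.groupby(['Postcode'])["Price"].mean()
# If the column name in the CSV header has a leading space, use " Price" instead:
#grouped = df.groupby(['Postcode'])[" Price"].mean()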
grouped.to_json('average_house_price.json')
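To see why this works, here is a minimal sketch comparing the two results. It assumes a small in-memory CSV in place of House_Prices.csv (the rows are made up for illustration) and a header with a leading space in " Price", as in the question. Calling to_json() without a path returns the JSON string, which makes the difference easy to inspect.

```python
import io
import json

import pandas as pd

# Hypothetical stand-in for House_Prices.csv; the rows are invented for illustration.
csv_text = "Postcode, Price\n5061,2000000\n5061,2050000\n5063,930000\n"

df = pd.read_csv(io.StringIO(csv_text), usecols=["Postcode", " Price"])

# Aggregating the whole frame keeps " Price" as a column, so it becomes the outer JSON key.
as_frame = df.groupby("Postcode").mean()

# Selecting the column first gives a Series, which serialises as a flat index -> value map.
as_series = df.groupby("Postcode")[" Price"].mean()

# With no path argument, to_json() returns the JSON text instead of writing a file.
frame_json = json.loads(as_frame.to_json())
series_json = json.loads(as_series.to_json())

print(frame_json)
print(series_json)
```

The Series form is the flat postcode-to-price mapping asked for in the question; the DataFrame form wraps the same mapping under the column name.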
Upvotes: 1