Reputation: 628
I have a dataframe where one column has a list of zipcodes and the other has property values corresponding to the zipcode. I want to sum up the property values in each row according to the appropriate zipcode.
So, for example:
zip value
2210 $5,000
2130 $3,000
2210 $2,100
2345 $1,000
I would then add up the values
$5,000 + $2,100 = $7,100
and reap the total property value of for zipcode 2210 as $7,100.
Any help in this regard will be appreciated
Upvotes: 0
Views: 1471
Reputation: 738
You need:
df
zip value
0 2210 5000
1 2130 3000
2 2210 2100
3 2345 1000
df2 = df.groupby(['zip'])['value'].sum()
df2
zip value
2130 3000
2210 7100
2345 1000
Name: value, dtype: int64
You can read more about it here.
Also you will need to remove the $ sign in the column values. For that you can use something along the lines of the following while reading the dataframe initially:
df = pd.read_csv('zip_value.csv', header=0,names=headers,converters={'value': lambda x: float(x.replace('$',''))})
Edit: Changed the code according to comment. To reset the index after groupby use:
df2 = df.groupby(['zip'])['value'].sum().reset_index()
Then to remove a particular column with zip value ,say, 2135 , you need
df3 = df2[df2['zip']!= 2135]
Upvotes: 1