Reputation: 18648
How do I find the count of an element in a column grouped by a few columns using python pandas?
I have the following csv file structure:
'Country' 'City' 'Year' 'Month' 'Value' 'Street_Code'
USA New York 1971 jan 0.0 1
USA New York 1971 feb 23.5 1
USA New York 1971 mar 10.2 1
USA Florida 1971 jan 0.0 1
USA Florida 1971 feb 0.0 1
USA Florida 1971 mar 0.0 1
USA New York 1971 jan 0.0 2
USA New York 1971 feb 15.0 2
USA New York 1971 mar 7.6 2
USA Florida 1971 jan 0.0 2
USA Florida 1971 feb 0.0 2
USA Florida 1971 mar 2.3 2
I want to count the number of zeros (0.0) in 'value'
by grouping 'Country'
, 'City'
,'Year'
& 'Street Code'
.
I've tried this so far;
import pandas as pd
data = pd.read_csv('country_details.csv')
count_data = data[data['Value'] == 0.0] # I'm filtering the data. I don't think this is the right way of doing it
grouped = count_data.groupby(['Country','Year','Month','Street_Code']) # I'm stuck here
Upvotes: 1
Views: 2694
Reputation: 64443
Your filtering is almost right, but you need to reference the column name, in this case 'Value'.
Try:
import pandas as pd
import StringIO
csv = StringIO.StringIO("""Country,City,Year,Month,Value,Street_Code
USA,NewYork,1971,jan,0.0,1
USA,NewYork,1971,feb,23.5,1
USA,NewYork,1971,mar,10.2,1
USA,Florida,1971,jan,0.0,1
USA,Florida,1971,feb,0.0,1
USA,Florida,1971,mar,0.0,1
USA,NewYork,1971,jan,0.0,2
USA,NewYork,1971,feb,15.0,2
USA,NewYork,1971,mar,7.6,2
USA,Florida,1971,jan,0.0,2
USA,Florida,1971,feb,0.0,2
USA,Florida,1971,mar,2.3,2""")
data = pd.read_csv(csv)
datasub = data[data['Value'] == 0.0]
print datasub.groupby(['Country','Year','Month','Street_Code'])['Value'].count()
Country Year Month Street_Code
USA 1971 feb 1 1
2 1
jan 1 2
2 2
mar 1 1
Upvotes: 2