Reputation: 21
unGrouped is a dataframe that looks like this:
date borough
969609 01/01/2014 BROOKLYN
967983 01/01/2014 BRONX
967982 01/01/2014 QUEENS
865943 01/01/2014 BROOKLYN
967981 01/01/2014 MANHATTAN
967980 01/01/2014 BROOKLYN
967979 01/01/2014 QUEENS
967984 01/01/2014 BRONX
967978 01/01/2014 QUEENS
967976 01/01/2014 BROOKLYN
967975 01/01/2014 BROOKLYN
I have the following code:
for row in unGrouped:
    if unGrouped['borough'][row]=='BRONX':
        bronxCount+=1
print bronxCount
And it gives me a KeyError: date.
I would like to iterate through the column borough, increment bronxCount whenever it comes across BRONX, and store that value for each row in a column called `bronxCount`, to eventually get a count of crimes in the Bronx for each day. If anyone could get this loop to work I would greatly appreciate it. Thanks for the help!
Upvotes: 0
Views: 569
Reputation: 85442
You can sum up after filtering:
>>> (unGrouped.borough == 'BRONX').sum()
2
To get the counts per date, just group by date and borough before counting:
>>> unGrouped.groupby(['date', 'borough']).size()
date borough
01/01/2014 BRONX 2
BROOKLYN 5
MANHATTAN 1
QUEENS 3
dtype: int64
Or, if you only want BRONX with a date index:
>>> unGrouped.groupby(['borough', 'date']).size().loc['BRONX']
date
01/01/2014 2
dtype: int64
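The question also asks for a running value stored on each row in a bronxCount column. A minimal sketch of one way to do that, assuming a cumulative sum per date is what is wanted; the small frame below, including its second date, is made up for illustration and is not the asker's data:

```python
import pandas as pd

# Hypothetical sample data; the second date is invented for the example.
unGrouped = pd.DataFrame({
    "date": ["01/01/2014"] * 4 + ["01/02/2014"] * 2,
    "borough": ["BROOKLYN", "BRONX", "QUEENS", "BRONX", "BRONX", "MANHATTAN"],
})

# 1 where the row is in the Bronx, 0 otherwise.
is_bronx = unGrouped["borough"].eq("BRONX").astype(int)

# Running total of Bronx rows, restarting at each new date.
unGrouped["bronxCount"] = is_bronx.groupby(unGrouped["date"]).cumsum()

print(unGrouped)
```

The last row of each date then holds that day's total, which matches what the groupby/size approach returns for BRONX.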
Upvotes: 2
Reputation: 323226
Using numpy:
import numpy as np
v, n = np.unique(unGrouped.borough.values, return_counts=True)
d = dict(zip(v, n))
d['BRONX']
Out[218]: 2
Upvotes: 0
Reputation: 153460
Use value_counts:
bronxCount = unGrouped.borough.value_counts()['BRONX']
print(bronxCount)
Output:
2
Upvotes: 0
Reputation: 11347
Generally, if you're using a for loop over a pandas DataFrame, you're probably doing it wrong!
What you probably want is a groupby and count:
unGrouped.groupby('borough').size()
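As for why the loop in the question raises a KeyError: iterating directly over a DataFrame yields its column labels, not its rows, so the first lookup is effectively unGrouped['borough']['date']. A small sketch that shows this and a loop that does work, assuming the eleven rows from the question with a default integer index instead of the original one:

```python
import pandas as pd

# The boroughs from the question; the index here is the default 0..10.
boroughs = ["BROOKLYN", "BRONX", "QUEENS", "BROOKLYN", "MANHATTAN", "BROOKLYN",
            "QUEENS", "BRONX", "QUEENS", "BROOKLYN", "BROOKLYN"]
unGrouped = pd.DataFrame({"date": ["01/01/2014"] * len(boroughs),
                          "borough": boroughs})

# Iterating over the frame gives column labels.
labels = [label for label in unGrouped]
print(labels)

# So the original loop looks up a row labelled 'date', which does not exist.
try:
    unGrouped["borough"]["date"]
except KeyError as err:
    print("KeyError:", err)

# Iterating over the column itself yields the values.
bronxCount = 0
for borough in unGrouped["borough"]:
    if borough == "BRONX":
        bronxCount += 1
print(bronxCount)
```

The explicit loop is only here to explain the error; the vectorised versions are shorter and usually faster.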
Upvotes: 1