Reputation: 45
I want to single out all the data from 2020 in a dataset. The date is formatted like this: 06/10/2011 06:47:44 PM
I tried this out just to see if I could single out the data and count it, but it didn't work:
count = 0
d2020 = data311['Created Date'][6][-13]
for d in d2020:
    if d == '2020':
        count += 1
        print(count)
It runs but doesn't return anything, if that makes sense. I've tried other ways, but the only difference is that it'll return 0, which is obviously incorrect.
I'm not actually trying to count the data from 2020, I just want to use only 2020 data to be able to plot it on a map.
This is the dataset: https://data.cityofnewyork.us/Social-Services/311-Noise-Complaints/p5f6-bkga
Upvotes: 2
Views: 274
Reputation: 106445
If your dates are always formatted like 06/10/2011 06:47:44 PM, as given in your question, you can certainly use a fixed slice of the string as you have attempted, except that instead of:
data311['Created Date'][6][-13]
you should use the slice operator : and, if data311 is a pandas DataFrame, apply it through the .str accessor so that it slices each string rather than the rows:
data311['Created Date'].str[6:-12]
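To see why the chained index does not work, here is a minimal sketch on a single plain Python string. The variable name sample is only for illustration, and with a pandas column the first [6] would select the row labelled 6 rather than a character.

```python
sample = '06/10/2011 06:47:44 PM'

# Indexing with [6] returns exactly one character, not the year.
single_char = sample[6]

# A second index such as single_char[-13] cannot work:
# a one-character string has no position -13 (IndexError).

# A slice returns the whole four-character year.
year_from_end = sample[6:-12]    # counted from the end of the string
year_from_start = sample[6:10]   # counted from the start of the string

print(single_char, year_from_end, year_from_start)
```

Counting from the start ([6:10]) is a little more robust than counting from the end, because it does not depend on the length of the time portion.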
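Or, to do it the more elegant way, you can use datetime.datetime.strptime to parse each date string into a datetime object and then get its year attribute. Use %I rather than %H for the hour, because %p only affects the parsed hour when it is paired with the 12-hour %I directive:
from datetime import datetime
data311['Created Date'].apply(lambda s: datetime.strptime(s, '%m/%d/%Y %I:%M:%S %p').year)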
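If the data is in a pandas DataFrame, the same parsing idea can probably be applied to the whole column at once with pd.to_datetime. This is only a sketch: the three-row data311 frame below is made-up sample data standing in for the real dataset, and data2020 is a name chosen for illustration.

```python
import pandas as pd

# Hypothetical sample standing in for the real 311 dataset.
data311 = pd.DataFrame({
    'Created Date': [
        '06/10/2011 06:47:44 PM',
        '01/15/2020 09:05:00 AM',
        '12/31/2020 11:59:59 PM',
    ],
})

# Parse the whole column; %I is the 12-hour clock that goes with %p.
created = pd.to_datetime(data311['Created Date'],
                         format='%m/%d/%Y %I:%M:%S %p')

# Keep only the rows whose parsed year is 2020.
data2020 = data311[created.dt.year == 2020]
print(len(data2020))
```

Passing an explicit format avoids any ambiguity between month-first and day-first dates, and is usually faster on a large file than letting pandas infer it.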
Upvotes: 1
Reputation: 10624
If df is your dataframe and 'Created Date' is your column, you can select the 2020 rows like this (the key is using .str so that the slice applies to each string, then filtering with the resulting boolean mask):
newdf = df[df['Created Date'].str[6:10] == '2020']
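As a quick check of this approach, here is a small sketch on made-up data. The three sample rows and the name mask are assumptions for illustration, not part of the real dataset.

```python
import pandas as pd

# Hypothetical sample rows in the same date format as the question.
df = pd.DataFrame({
    'Created Date': [
        '06/10/2011 06:47:44 PM',
        '03/02/2020 10:15:30 AM',
        '11/20/2020 08:00:00 PM',
    ],
})

# .str[6:10] slices every string in the column, giving a column of years.
mask = df['Created Date'].str[6:10] == '2020'

# Boolean indexing keeps only the rows where the mask is True.
newdf = df[mask]

# Summing the mask counts the True values, i.e. the number of 2020 rows.
print(int(mask.sum()))
```

The filtered newdf keeps all of the original columns, so it can be passed straight to whatever plotting code needs only the 2020 rows.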
Upvotes: 1