Ignacio RB

Reputation: 33

How can I filter a data frame with a datetime index by a month and year input? Pandas

Given a df like this:

df=pd.read_csv(PATH + 'Matriz3_fechas.csv',index_col='Fecha',skiprows=0)
df.index = pd.DatetimeIndex(df.index)

Note that Fecha is already the index, in datetime format.

                     D576972dc305aa  D576972dc32e9a  D576972dc3590a
Fecha
2016-06-01 00:00:00             0.0             0.0             0.1
2016-07-01 00:05:00             0.0             0.0             0.1
2017-05-01 00:10:00             0.0             0.0             0.1
2017-05-01 00:15:00             0.0             0.0             0.1
2017-07-01 00:20:00             0.0             0.0             0.1

I've tried to filter by month and year:

df=df[(df.index.month==5)&(matriz.index.year==2017)]

But it won't filter the df to give the desired result:

                     D576972dc305aa  D576972dc32e9a  D576972dc3590a
Fecha
2017-05-01 00:10:00             0.0             0.0             0.1
2017-05-01 00:15:00             0.0             0.0             0.1

Upvotes: 2

Views: 7450

Answers (1)

jezrael

Reputation: 862511

You can use partial string indexing:

# to get a DatetimeIndex directly, pass the parse_dates parameter
df=pd.read_csv(PATH+'Matriz3_fechas.csv',index_col='Fecha',skiprows=0,parse_dates=['Fecha'])

print (df.index)
DatetimeIndex(['2016-06-01 00:00:00', '2016-07-01 00:05:00',
               '2017-05-01 00:10:00', '2017-05-01 00:15:00',
               '2017-07-01 00:20:00'],
              dtype='datetime64[ns]', name='Fecha', freq=None)


df = df.loc['2017-05']
print (df)
                     D576972dc305aa  D576972dc32e9a  D576972dc3590a
Fecha                                                              
2017-05-01 00:10:00             0.0             0.0             0.1
2017-05-01 00:15:00             0.0             0.0             0.1
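
Since the question asks for a month and year input, the partial string can be built from two integers. A minimal sketch, assuming a hypothetical helper filter_by_month (not part of pandas) and a small in-memory frame standing in for the CSV:

```python
import pandas as pd

# Small stand-in for the CSV from the question (same index and column names).
idx = pd.DatetimeIndex(
    ["2016-06-01 00:00:00", "2016-07-01 00:05:00",
     "2017-05-01 00:10:00", "2017-05-01 00:15:00",
     "2017-07-01 00:20:00"],
    name="Fecha",
)
df = pd.DataFrame(
    {"D576972dc305aa": 0.0, "D576972dc32e9a": 0.0, "D576972dc3590a": 0.1},
    index=idx,
)

def filter_by_month(frame, year, month):
    """Hypothetical helper: select rows of one calendar month via partial string indexing."""
    key = f"{year:04d}-{month:02d}"  # e.g. 2017, 5 -> "2017-05"
    return frame.loc[key]

result = filter_by_month(df, 2017, 5)
print(result)
```

Note that .loc may raise a KeyError when the requested month falls outside the range of the index, so guard the call if that case can occur.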

Your solution also works, provided matriz is meant to be df (I think it is a typo):

df=df[(df.index.month==5)&(df.index.year==2017)]
print (df)
                     D576972dc305aa  D576972dc32e9a  D576972dc3590a
Fecha                                                              
2017-05-01 00:10:00             0.0             0.0             0.1
2017-05-01 00:15:00             0.0             0.0             0.1
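
The boolean-mask version can be wrapped for a month and year input as well. A sketch, assuming a hypothetical helper mask_by_month and a small in-memory frame in place of the CSV; with a mask, a month that has no rows simply yields an empty frame:

```python
import pandas as pd

# Small stand-in for the CSV from the question (same index and column names).
idx = pd.DatetimeIndex(
    ["2016-06-01 00:00:00", "2016-07-01 00:05:00",
     "2017-05-01 00:10:00", "2017-05-01 00:15:00",
     "2017-07-01 00:20:00"],
    name="Fecha",
)
df = pd.DataFrame(
    {"D576972dc305aa": 0.0, "D576972dc32e9a": 0.0, "D576972dc3590a": 0.1},
    index=idx,
)

def mask_by_month(frame, year, month):
    """Hypothetical helper: keep rows whose index matches both year and month."""
    mask = (frame.index.year == year) & (frame.index.month == month)
    return frame[mask]

may_2017 = mask_by_month(df, 2017, 5)   # two rows
jan_2018 = mask_by_month(df, 2018, 1)   # no matching rows: empty frame
print(may_2017)
print(len(jan_2018))
```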

Upvotes: 4
