KeyError when selecting specific values with two conditions

Question

My database has data about different plants in different days and hours, and some calculation thhat were done on them. I want to create "mini" datasets ,when each data set will contain data about different days and data. for example: dataset 1: all the plants, date:17/6, hour 12:00, Treatment, Line,NDVI.....

I have tried to do that using the code

df_07=df_indices[(df_indices['date']=='6/17/2019') & (df_indices['Hour'] > '12:00')]

but I keep get KeyError:

KeyError: 'date'

I don't know why it doesn't recognize date, because as you can ssee in the image, it has the column date. Can it happen because i'm using 'date' as an index?

My end goal is to create new tables that will have only one date and one hour for all the plants.

jezrael · Accepted Answer

Use Index.get_level_values:

m1 = (df_indices.index.get_level_values('date')=='6/17/2019')

For second mask convert values to timedeltas by to_timedelta and compare with Timedelta:

m2 = pd.to_timedelta(df_indices.index.get_level_values('Hour')+':00') > pd.Timedelta(12, 'h')

df_07=df_indices[m1 & m2]

Better solution is join date with hour level, so possible compare by datetimes:

Sample data:

df_indices = pd.DataFrame({'date':['6/17/2019','6/17/2019'],
                           'Hour':['15:00','10:00'],
                           'col':[4,5]}).set_index(['date','Hour'])


print (df_indices)
                 col
date      Hour      
6/17/2019 15:00    4
          10:00    5

dates = pd.to_datetime(df_indices.index.get_level_values('date') + ' ' + 
                       df_indices.index.get_level_values('Hour'))
print (dates)
DatetimeIndex(['2019-06-17 15:00:00', '2019-06-17 10:00:00'], 
              dtype='datetime64[ns]', freq=None)


df_07=df_indices[dates > '2019-06-17 12:00:00']
print (df_07)
                 col
date      Hour      
6/17/2019 15:00    4

KeyError when selecting specific values with two conditions

Answers (1)

Related Questions