Reputation: 3577
I have a DataFrame like this:
Month Day Year TmaxF
4 1 1912 56.00
4 2 1912 56.00
4 3 1912 74.00
1 1 1913 38
1 2 1913 28
1 3 1913 21
1 1 1914 30.00
1 2 1914 31.00
1 3 1914 20.00
and I want to only select data that is in the years 1913 and 1914. .isin
isn't what I want here because this is a simplified dataset.
I am more looking for something like:
df.loc['1913':'1914']
but when I set Year
as the index and run this code it returns the error:
TypeError: cannot do slice indexing on <class 'pandas.core.index.Int64Index'> with these indexers [1913] of <type 'str'>
df.info() returns:
Month 36397 non-null int64
Day 36397 non-null int64
Year 36397 non-null int64
TmaxF 35600 non-null float64
Upvotes: 1
Views: 2823
Reputation: 5362
After you set the year as the index, use a slice
df.set_index('Year',inplace=True)
df.loc[slice('1913','1914'),:]
# Month Day TmaxF
#Year
#1913 1 1 38
#1913 1 2 28
#1913 1 3 21
#1914 1 1 30.00
#1914 1 2 31.00
#1914 1 3 20.00
Here I am using strings as the index:
df.index
#Index([u'1912', u'1912', u'1912', u'1913', u'1913', u'1913', u'1914', u'1914',
# u'1914'],
# dtype='object', name=u'Year')
It looks as your Year
column was originally integer, so maybe your index is
df.index
#Int64Index([1912, 1912, 1912, 1913, 1913, 1913, 1914, 1914, 1914], dtype='int64', name=u'Year')
If so, then make your slicer ranges integers: df.loc[slice(1913,1914),:]
Upvotes: 0
Reputation: 2361
First of all, notice that you data is number (int64) and not string. From the way you tried to query the data I believe that you followed a guide with dates as indexes (in which case you can slice by the date or part of it)
Putting that aside, it is important to remember that df.loc
is used to slice according to the index (which does not appear in the table you sent).
While you can set the year as index, the more elegant way to slice the data as you wish would be using "boolean indexing" :
df[(df.Year >= 1913) && (df.Year <= 1914)]
If you still insist on making the year as the index, this can be done as follows:
df.index = df.Year
df.loc[1913:1914]
Upvotes: 2