Reputation: 4838
Why can I select by month in this case, but not by date?
dates = pd.date_range( start = "01/01/1931" , end = "01/02/1941" )
new_df_4 = new_df_3.reindex(dates)
new_df_4["1931-10"]
But this doesn't work:
new_df_4["1931-10-02"]
KeyError Traceback (most recent call last)
in ()
----> 1 new_df_4["1931-10-02"]
/Users/romain/anaconda/lib/python2.7/site-packages/pandas/core/frame.pyc in __getitem__(self, key)
1990 return self._getitem_multilevel(key)
1991 else:
-> 1992 return self._getitem_column(key)
1993
1994 def _getitem_column(self, key):
/Users/romain/anaconda/lib/python2.7/site-packages/pandas/core/frame.pyc in _getitem_column(self, key)
2002 result = self._constructor(self._data.get(key))
2003 if result.columns.is_unique:
-> 2004 result = result[key]
2005
2006 return result
/Users/romain/anaconda/lib/python2.7/site-packages/pandas/core/frame.pyc in __getitem__(self, key)
1990 return self._getitem_multilevel(key)
1991 else:
-> 1992 return self._getitem_column(key)
1993
1994 def _getitem_column(self, key):
/Users/romain/anaconda/lib/python2.7/site-packages/pandas/core/frame.pyc in _getitem_column(self, key)
1997 # get column
1998 if self.columns.is_unique:
-> 1999 return self._get_item_cache(key)
2000
2001 # duplicate columns & possible reduce dimensionality
/Users/romain/anaconda/lib/python2.7/site-packages/pandas/core/generic.pyc in _get_item_cache(self, item)
1343 res = cache.get(item)
1344 if res is None:
-> 1345 values = self._data.get(item)
1346 res = self._box_item_values(item, values)
1347 cache[item] = res
/Users/romain/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in get(self, item, fastpath)
3223
3224 if not isnull(item):
-> 3225 loc = self.items.get_loc(item)
3226 else:
3227 indexer = np.arange(len(self.items))[isnull(self.items)]
/Users/romain/anaconda/lib/python2.7/site-packages/pandas/indexes/base.pyc in get_loc(self, key, method, tolerance)
1876 return self._engine.get_loc(key)
1877 except KeyError:
-> 1878 return self._engine.get_loc(self._maybe_cast_indexer(key))
1879
1880 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4027)()
pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3891)()
pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12408)()
pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12359)()
KeyError: '1931-10-02'
Upvotes: 1
Views: 2080
Reputation: 862641
To select by month, use partial string indexing:
print (new_df_4["1931-10"])
This won't work if the string has the same resolution as the index (from the same docs):
Warning: However, if the string is treated as an exact match, the selection in DataFrame's [] will be column-wise and not row-wise, see Indexing Basics. For example, dft_minute['2011-12-31 23:59'] will raise KeyError, as '2011-12-31 23:59' has the same resolution as the index and there is no column with such a name. To always have unambiguous selection, whether the row is treated as a slice or a single selection, use .loc.
In [95]: dft_minute.loc['2011-12-31 23:59']
Out[95]:
a 1
b 4
Name: 2011-12-31 23:59:00, dtype: int64
To select by a single date, use loc:
new_df_4.loc["1931-10-02"]
Sample:
np.random.seed(10)
dates = pd.date_range( start = "01/01/1931" , end = "01/02/1941" )
new_df_4 = pd.DataFrame({'a':np.random.randint(10, size=len(dates))}, index=dates)
print (new_df_4.head())
a
1931-01-01 9
1931-01-02 4
1931-01-03 0
1931-01-04 1
1931-01-05 9
print (new_df_4["1931-10"])
a
1931-10-01 9
1931-10-02 6
1931-10-03 9
1931-10-04 7
1931-10-05 8
1931-10-06 0
1931-10-07 9
1931-10-08 6
1931-10-09 0
1931-10-10 1
1931-10-11 0
...
print (new_df_4.loc["1931-10-02"])
a 6
Name: 1931-10-02 00:00:00, dtype: int32
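As a minimal, self-contained sketch of the distinction above: using loc for both the month and the single date keeps the selection row-wise in every case. The frame name df and the np.arange column values are assumptions made for illustration, not the asker's data. As far as I know, recent pandas releases no longer accept a partial date string in plain [] for row selection at all, so loc is the safer habit either way.

```python
import numpy as np
import pandas as pd

# Hypothetical daily-indexed frame, mirroring the date range in the question.
dates = pd.date_range(start="1931-01-01", end="1941-01-02")
df = pd.DataFrame({"a": np.arange(len(dates))}, index=dates)

# Month string: coarser than the daily index, so loc returns a slice of rows.
october = df.loc["1931-10"]

# Day string: same resolution as the index, so loc returns that single row
# as a Series.
one_day = df.loc["1931-10-02"]

# Plain [] with a day string is looked up as a column label, so it fails.
try:
    df["1931-10-02"]
    raised_key_error = False
except KeyError:
    raised_key_error = True

print(len(october))
print(one_day["a"])
print(raised_key_error)
```

If a one-row DataFrame is needed instead of a Series, slicing with the same date on both ends, df.loc["1931-10-02":"1931-10-02"], should return one.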
Upvotes: 4