Reputation: 114230
I have a dataframe with some columns:
>>> import numpy as np
>>> import pandas as pd
>>> np.random.seed(0xFEE7)
>>> df = pd.DataFrame({'A': np.random.randint(10, size=10),
...                    'B': np.random.randint(10, size=10),
...                    'C': np.random.choice(['A', 'B'], size=10)})
>>> df
   A  B  C
0  0  0  B
1  4  0  B
2  6  6  A
3  8  3  B
4  0  2  A
5  8  4  A
6  4  1  B
7  8  7  A
8  4  4  A
9  1  1  A
I also have a boolean series that matches part of the index of df:
>>> g = df.groupby('C').get_group('A')
>>> ser = g['B'] > 5
>>> ser
2     True
4    False
5    False
7     True
8    False
9    False
Name: B, dtype: bool
I'd like to be able to use ser to set or extract data from df. For example:
>>> df.loc[ser, 'A'] -= 3
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\jfoxrabinovitz\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexing.py", line 1762, in __getitem__
return self._getitem_tuple(key)
File "C:\Users\jfoxrabinovitz\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexing.py", line 1289, in _getitem_tuple
retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
File "C:\Users\jfoxrabinovitz\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexing.py", line 1914, in _getitem_axis
return self._getbool_axis(key, axis=axis)
File "C:\Users\jfoxrabinovitz\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexing.py", line 1782, in _getbool_axis
key = check_bool_indexer(labels, key)
File "C:\Users\jfoxrabinovitz\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexing.py", line 2317, in check_bool_indexer
raise IndexingError(
pandas.core.indexing.IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).
The error makes sense since ser is not the same length as df. How do I tell the dataframe to update the rows that match the index of ser and are set to True?
Specifically, I am looking to modify entries at indices 2 and 7 only:
>>> df # after modification
   A  B  C
0  0  0  B
1  4  0  B
2  3  6  A
3  8  3  B
4  0  2  A
5  8  4  A
6  4  1  B
7  5  7  A
8  4  4  A
9  1  1  A
Upvotes: 3
Views: 763
Reputation: 75080
Since the index of ser does not match that of the original dataframe, you get that error.
You can solve it in two ways:
Either use series.reindex with a fill_value of False (boolean) and then use loc, so the indexes are aligned:
df.loc[ser.reindex(df.index,fill_value=False),'A'] = ... #setvalue
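To make the reindex approach concrete, here is a minimal sketch. It assumes the frame from the question, rebuilt by hand from the printed output rather than regenerated from the seed; the name mask is only illustrative.

```python
import pandas as pd

# Values copied from the printed frame in the question (assumption:
# typed in by hand instead of regenerated from the random seed).
df = pd.DataFrame({
    'A': [0, 4, 6, 8, 0, 8, 4, 8, 4, 1],
    'B': [0, 0, 6, 3, 2, 4, 1, 7, 4, 1],
    'C': list('BBABAABAAA'),
})

# Boolean series covering only the rows where C == 'A'.
ser = df.groupby('C').get_group('A')['B'] > 5

# Expand ser to the full index of df; labels missing from ser become False.
mask = ser.reindex(df.index, fill_value=False)

# mask now aligns with df, so loc accepts it as a boolean indexer.
df.loc[mask, 'A'] -= 3
print(df)
```

Only rows 2 and 7 change here, because every label that ser did not cover is filled with False.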
Or you can boolean-index ser with itself so it returns only the True values, then grab the index, which you can use with loc:
df.loc[ser[ser].index,'A'] = ... #setvalue
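As a rough, runnable illustration of the label-based variant, the sketch below again assumes the question's frame typed in by hand; idx is a hypothetical name for the intermediate result.

```python
import pandas as pd

# Values copied from the printed frame in the question (assumption:
# typed in by hand instead of regenerated from the random seed).
df = pd.DataFrame({
    'A': [0, 4, 6, 8, 0, 8, 4, 8, 4, 1],
    'B': [0, 0, 6, 3, 2, 4, 1, 7, 4, 1],
    'C': list('BBABAABAAA'),
})
ser = df.groupby('C').get_group('A')['B'] > 5

# Indexing ser with itself keeps only the True entries;
# .index then yields the labels of those rows.
idx = ser[ser].index

# loc selects by label, so no alignment of a boolean mask is needed.
df.loc[idx, 'A'] -= 3
print(df)
```

This relies on the labels in df being unique. If the index had duplicate labels, selecting by label would touch every row carrying that label.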
Upvotes: 4
Reputation: 313
I guess you could just pass the index of ser to loc, since both come from a common index.
df.loc[ser.index, 'A'] -= 3
As commented by @Shubham Sharma, the OP needs to filter only the True values. This approach selects every label in ser.index (all rows where C is 'A'), not just the True ones.
@anky provided a way for that as:
df.loc[ser[ser].index, 'A'] -= 3
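The difference between the two selections can be checked side by side. This is only a sketch, assuming the question's frame rebuilt by hand; all_rows and true_rows are illustrative names for two independent copies.

```python
import pandas as pd

# Values copied from the printed frame in the question (assumption:
# typed in by hand instead of regenerated from the random seed).
df = pd.DataFrame({
    'A': [0, 4, 6, 8, 0, 8, 4, 8, 4, 1],
    'B': [0, 0, 6, 3, 2, 4, 1, 7, 4, 1],
    'C': list('BBABAABAAA'),
})
ser = df.groupby('C').get_group('A')['B'] > 5

# ser.index holds every label in the group, True or False.
all_rows = df.copy()
all_rows.loc[ser.index, 'A'] -= 3

# ser[ser].index holds only the labels whose value is True.
true_rows = df.copy()
true_rows.loc[ser[ser].index, 'A'] -= 3

print(list(ser.index))       # labels touched by the first version
print(list(ser[ser].index))  # labels touched by the second version
```

With this data the first version modifies rows 2, 4, 5, 7, 8 and 9, while the second modifies rows 2 and 7 only, which is what the question asks for.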
Upvotes: 2