Reputation: 451
I have a DataFrame with an empty column 'pred' that I want to fill with some specific values. The values were originally in a numpy array, but I put them in a Series called "this":
print(type(predictions))
print(predictions)
['collection2' 'collection2' 'collection2' 'collection1' 'collection2'
'collection1']
this = pd.Series(predictions, index=test_indices)
print(type(data))
<class 'pandas.core.frame.DataFrame'>
print(data.shape)
(35, 4)
print(data.iloc[test_indices])
     class        pred  text  \
223  collection2  []    Fellow-Citizens of the Senate and House of Rep...
20   collection1  []    The period for a new election of a citizen to ...
12   collection1  []    Fellow Citizens of the Senate and of the House...
13   collection1  []    Whereas combinations to defeat the execution o...
212  collection2  []    MR. PRESIDENT AND FELLOW-CITIZENS OF NEW-YORK:...
230  collection2  []    Fellow-Countrymen:\nAt this second appearing t...
     title
223  First Annual Message
20   Farewell Address
12   Fifth Annual Message to Congress
13   Proclamation against Opposition to Execution o...
212  Cooper Union Address
230  Second Inaugural Address
print(type(this))
<class 'pandas.core.series.Series'>
print(this.shape)
(6,)
print(this)
0 collection2
1 collection1
2 collection1
3 collection1
4 collection2
5 collection2
I thought I could do something like:
data.iloc[test_indices, [4]] = this
but that results in
IndexError: positional indexers are out-of-bounds
or
data.ix[test_indices, ['pred']] = this
KeyError: '[0] not in index'
Upvotes: 3
Views: 8094
Reputation: 111
I prefer .ix over .loc. You can use
data.ix[bool_series, 'pred'] = this
Here, bool_series is a boolean Series that is True for the rows you want to update and False otherwise. Example:
bool_series = ((data['col1'] > some_number) & (data['col2'] < some_other_number))
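To make that concrete, here is a small runnable sketch of building such a mask. The column names col1/col2 and both thresholds are placeholders carried over from the line above, and the toy data is made up for illustration:

```python
import pandas as pd

# Toy data; col1/col2 and both thresholds are made-up placeholders.
data = pd.DataFrame({"col1": [1, 5, 8, 3], "col2": [10, 2, 4, 9]})
some_number = 2
some_other_number = 9

# Each comparison yields a boolean Series; & combines them element-wise.
# The parentheses are required because & binds tighter than > and <.
bool_series = (data["col1"] > some_number) & (data["col2"] < some_other_number)
print(bool_series.tolist())
```

Only the rows where both conditions hold end up True, and those are the rows an assignment through the mask would touch.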
However, make sure you already have a 'pred' column before you use data.ix[bool_series, 'pred']. Otherwise, it will give an error.
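One caveat for newer installs: .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so today the same idea would likely be written with .loc. Below is a minimal sketch, assuming test_indices holds row labels (the printed index 223, 20, 12, ... suggests that) rather than positions. The toy frame and values are made up and only stand in for the data in the question:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the question's frame; labels and values are made up.
data = pd.DataFrame(
    {"class": ["collection2", "collection1", "collection1", "collection2"],
     "pred": [None, None, None, None]},
    index=[223, 20, 12, 7],
)
test_indices = [223, 20, 12]  # assumed to be index labels, not positions
predictions = np.array(["collection2", "collection2", "collection1"])

# Option 1: assign the raw array; values are placed in the order of test_indices.
data.loc[test_indices, "pred"] = predictions

# Option 2: use a boolean mask. A Series on the right-hand side is aligned
# on its index, so its index has to match the row labels being updated.
this = pd.Series(predictions, index=test_indices)
bool_series = data.index.isin(test_indices)
data.loc[bool_series, "pred"] = this

print(data)
```

If the Series keeps a default 0..n-1 index instead of the row labels, the alignment in Option 2 will not find matching labels, which looks like the likely cause of the "[0] not in index" error in the question.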
Upvotes: 2