Brian Feeny

Reputation: 451

How to update a specific DataFrame slice of a column with new values?

I have a DataFrame with a column 'pred' that is empty, and I want to update it with some specific values. They were originally in a NumPy array, but I put them in a Series called "this":

print(type(predictions))

print(predictions)
['collection2' 'collection2' 'collection2' 'collection1' 'collection2'
 'collection1']

this = pd.Series(predictions, index=test_indices)

print(type(data))
<class 'pandas.core.frame.DataFrame'>

print(data.shape)
(35, 4)

print(data.iloc[test_indices])
     class         pred                                          text  \
223  collection2   []  Fellow-Citizens of the Senate and House of Rep...   
20   collection1   []  The period for a new election of a citizen to ...   
12   collection1   []  Fellow Citizens of the Senate and of the House...   
13   collection1   []  Whereas combinations to defeat the execution o...   
212  collection2   []  MR. PRESIDENT AND FELLOW-CITIZENS OF NEW-YORK:...   
230  collection2   []  Fellow-Countrymen:\nAt this second appearing t...   

                                                 title  
223                               First Annual Message  
20                                    Farewell Address  
12                    Fifth Annual Message to Congress  
13   Proclamation against Opposition to Execution o...  
212                               Cooper Union Address  
230                           Second Inaugural Address 

print(type(this))
<class 'pandas.core.series.Series'>

print(this.shape)
(6,)

print(this)
0    collection2
1    collection1
2    collection1
3    collection1
4    collection2
5    collection2

I thought I could do something like:

data.iloc[test_indices, [4]] = this

but that results in

IndexError: positional indexers are out-of-bounds

or

data.ix[test_indices, ['pred']] = this
KeyError: '[0] not in index'

Upvotes: 3

Views: 8094

Answers (2)

Prakriti Gupta

Reputation: 111

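I prefer .ix over .loc, although .ix was deprecated in pandas 0.20 and removed in 1.0, so on current versions use .loc, which accepts the same boolean mask. You can use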

data.ix[bool_series, 'pred'] = this

Here, bool_series is a boolean Series containing True for the rows you want to update and False otherwise. Example:

bool_series = ((data['col1'] > some_number) & (data['col2'] < some_other_number))

However, make sure you already have a 'pred' column before you use data.ix[bool_series, 'pred']. Otherwise, it will give an error.
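A minimal sketch of the boolean-mask update described above, written with .loc. The column names, thresholds, and values are made up for illustration and are not from the question's data. A plain list is assigned, so the values are written in row order rather than aligned by index label.

```python
import pandas as pd

# Hypothetical data; 'pred' already exists as an empty-string column.
data = pd.DataFrame({"col1": [1, 5, 9], "col2": [10, 2, 3], "pred": ""})

# True for the rows to update, False otherwise.
bool_series = (data["col1"] > 2) & (data["col2"] < 5)

# The list must have one value per True entry in the mask.
data.loc[bool_series, "pred"] = ["a", "b"]
```

If a Series is assigned instead of a list, pandas aligns it on index labels, so its index should match the labels of the selected rows.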

Upvotes: 2

piRSquared

Reputation: 294288

Try:

data.loc[data.index[test_indices], 'pred'] = this
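To illustrate why this works: test_indices holds row positions, while .loc selects by label, so data.index[test_indices] translates positions into labels first. The sketch below uses a small hypothetical frame (not the question's actual data) and assigns the raw NumPy array rather than a Series. That is an assumption made here to sidestep index alignment, since a Series whose index does not match the row labels would be aligned by label and could leave NaN behind.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame: non-sequential labels.
data = pd.DataFrame(
    {
        "class": ["collection2", "collection1", "collection1", "collection1"],
        "pred": [None] * 4,
    },
    index=[223, 20, 12, 13],
)

test_indices = [0, 2]  # row positions, as used with .iloc
predictions = np.array(["collection2", "collection1"])

# Translate positions into index labels, then assign by label.
row_labels = data.index[test_indices]
data.loc[row_labels, "pred"] = predictions
```

With a Series such as "this", passing this.values (or building the Series with index=data.index[test_indices]) should give the same result.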

Upvotes: 5
