Pandas dataframe indexing anomaly?

Question

The Python code below first creates a multi-indexed pandas dataframe then attempts to change one of its elements. The element in question is printed before and after the change to verify that the change worked. The problem is that it does not work. Please have a look at this code and let me know what the problem is.

import numpy as np
import pandas as pd
arrays = [['Apple','Apple','Banana','Banana','Cherry','Cherry'],
         ['one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame(np.zeros([3, 6]), index=['A', 'B', 'C'], columns=index)
df.insert(0, 'Insert', [1,2,3]) # the absence of this line makes the problem disappear
print(df['Apple']['one']['A']) # this line correctly prints 0.0
df['Apple']['one']['A'] = 15
print(df['Apple']['one']['A']) # this line again prints 0.0 when we should get 15 now

Woody Pride · Accepted Answer

You need to do the following:

df.loc['A', ('Apple', 'one')] = 15

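It is not an anomaly. You are doing 'chained assignment': df['Apple']['one']['A'] = 15 is three separate indexing operations, and df['Apple'] can return a copy of the underlying data, so the assignment changes that temporary copy rather than df itself. Someone will be able to tell you more accurately what is going on, but to index properly use .loc (or .ix on older pandas versions; .ix was removed in pandas 1.0).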
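As a minimal sketch of the difference, the snippet below rebuilds the frame from the question and sets the element with a single .loc call, passing the row label and the full column key as a tuple. The variable name value is only for illustration, and pd.MultiIndex.from_arrays is used here as a shortcut for the zip/from_tuples steps in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question: three rows, six MultiIndex columns.
arrays = [['Apple', 'Apple', 'Banana', 'Banana', 'Cherry', 'Cherry'],
          ['one', 'two', 'one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])
df = pd.DataFrame(np.zeros([3, 6]), index=['A', 'B', 'C'], columns=index)
df.insert(0, 'Insert', [1, 2, 3])

# One indexing operation: row label first, then the column key as a tuple.
# Because pandas resolves this in a single call, it writes into df itself
# rather than into an intermediate object.
df.loc['A', ('Apple', 'one')] = 15

# Read the element back the same way to confirm the change.
value = df.loc['A', ('Apple', 'one')]
print(value)
```

Reading with the same .loc form keeps lookups and assignments symmetric, so there is no intermediate df['Apple'] object that might be a copy.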

See this answer:

How to deal with SettingWithCopyWarning in Pandas?
