Reputation: 730
I am trying to initialize an instance by passing it a DataFrame, but for some reason I get the warning shown after the code:
class TestReg:
    def __init__(self, x, y, create_intercept=False):
        self.x = x
        self.y = y
        if create_intercept:
            self.x['intercept'] = 1

x = data[['class', 'year']]
y = data['performance']
reg = TestReg(x, y, create_intercept=True)
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.x['intercept'] = 1
Any idea what I am doing wrong?
Upvotes: 2
Views: 3231
Reputation: 148965
You are trying to change values in an extract of a DataFrame (a slice, in pandas terminology).
Stripped down, what you are doing is:
x = data[['class', 'year']] # x is a slice here
x['intercept'] = 1 # dangerous because behaviour is undefined => warning
Pandas may return either a copy or a view when you take a slice (here, two columns of a DataFrame). This does not matter when you only read the data, but it does when you try to modify it, hence the warning.
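To see this in isolation, here is a minimal sketch that reproduces the assignment outside the class and records any warnings it raises. The sample values are made up for illustration; only the column names come from the question. Whether a warning is actually emitted depends on the pandas version and its copy-on-write settings, so the first printed list may be empty.

```python
import warnings

import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    "class": [1, 2, 1],
    "year": [2018, 2019, 2020],
    "performance": [0.5, 0.7, 0.9],
})

x = data[["class", "year"]]  # x is a slice of data

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning, even repeated ones
    x["intercept"] = 1               # the assignment the warning complains about

# Names of the warning categories that were raised (may be empty on newer pandas).
print([w.category.__name__ for w in caught])

# The new column lands on x; the original dataframe is not modified.
print("intercept" in x.columns)
print("intercept" in data.columns)
```

In this particular case the column ends up on x and not on data, but pandas cannot tell whether that is what you intended, which is why it warns.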
You should pass the original dataframe and only make changes through it:
class TestReg:
    def __init__(self, data, cols, y, create_intercept=False):
        self.data = data
        self.y = y
        if create_intercept:
            self.data['intercept'] = 1  # change goes through the original dataframe
            cols.append('intercept')    # note: this also extends the caller's list
        self.x = data[cols]
...
reg = TestReg(data, ['class', 'year'], y, create_intercept=True)
Alternatively, you could force a copy if you do not want to change the original dataframe:
...
x = data[['class', 'year']].copy()
y = data['performance']
reg = TestReg(x, y, create_intercept=True)
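For completeness, a self-contained sketch of the copy-based variant, using the class from the question unchanged. The sample values are hypothetical; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    "class": [1, 2, 1],
    "year": [2018, 2019, 2020],
    "performance": [0.5, 0.7, 0.9],
})


class TestReg:
    def __init__(self, x, y, create_intercept=False):
        self.x = x
        self.y = y
        if create_intercept:
            self.x["intercept"] = 1


# .copy() makes x an independent dataframe, so adding a column is unambiguous.
x = data[["class", "year"]].copy()
y = data["performance"]
reg = TestReg(x, y, create_intercept=True)

print(list(reg.x.columns))   # columns of the copy, including the new one
print(list(data.columns))    # columns of the original, unchanged
```

The cost is one extra copy of the selected columns in memory, which is usually acceptable unless the dataframe is very large.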
Upvotes: 4