SamCie

Reputation: 131

SettingWithCopyWarning on index

When I try to create a column of mdates from the index (which contains datetime64[ns] values) via:

df['mdates'] = mdates.date2num(df.index)

I get the:

SettingWithCopyWarning

How do I fix this? Normally I'd use df.loc, but how can you use df.loc on the index?

Upvotes: 0

Views: 83

Answers (2)

SamCie

Reputation: 131

The issue only occurs when I pass the df to a function, slice a range of dates via:

df.loc[start_date:end_date]

and then try to create the new column:

df['mdates'] = mdates.date2num(df.index)

I solved the issue by copying the df when it's passed into the function. The first line of code in the function is:

df = df.copy()

It may not be the most efficient way to do it, but it solved the problem. If someone can explain why the warning is raised without this line of code, I'd be grateful.

Upvotes: 1

tnwei

Reputation: 962

Given the information in your own answer, it appears that the SettingWithCopyWarning arose not because of the mdates.date2num function, but because of the date slicing beforehand.

The purpose of the SettingWithCopyWarning is to alert you to potentially unintended behaviour when performing chained assignment. Chained assignment occurs when you assign values on a dataframe that is itself the product of an earlier indexing operation. In this case, chained assignment occurs because you are assigning to the mdates column in df, where df is the result of slicing the original df passed into the function with df.loc[start_date:end_date].

Where does the unintended behaviour come from? An indexing operation in pandas can return either a view or a copy. When you edit a view, the original dataframe is affected. When you edit a copy, however, the original dataframe is not affected. It is not easy to tell in advance whether a given indexing operation will return a view or a copy.

In this case, slicing df resulted in a copy. Pandas knows, however, that the sliced df is derived from the original df, and keeps track of that link using a weak reference (https://docs.python.org/3/library/weakref.html). It is not clear to pandas whether you intended the new column to affect the original df, so a SettingWithCopyWarning is raised.

When you explicitly create a copy of the dataframe, pandas knows that you want them to be treated separately, and thus any value assignment on the copied dataframe should not affect the original dataframe. Thus, no SettingWithCopyWarning is raised.
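As a minimal sketch of the pattern described above (slice inside a function, copy explicitly, then add the column): the function name add_mdates, the sample data, and the helper to_day_numbers are made up for illustration. to_day_numbers is only a stand-in for mdates.date2num so the example runs without matplotlib; it is not part of either library. Whether the warning appears without the .copy() depends on the pandas version, and releases with Copy-on-Write enabled may not emit it at all.

```python
import warnings

import pandas as pd


def to_day_numbers(index):
    # Hypothetical stand-in for mdates.date2num: float days since 1970-01-01.
    return ((index - pd.Timestamp("1970-01-01")) / pd.Timedelta(days=1)).to_numpy()


def add_mdates(df, start_date, end_date):
    # Slice the date range, then copy explicitly so the result is an
    # independent dataframe rather than something derived from the caller's df.
    sliced = df.loc[start_date:end_date].copy()
    sliced["mdates"] = to_day_numbers(sliced.index)
    return sliced


# Made-up sample data with a datetime64[ns] index.
df = pd.DataFrame(
    {"value": range(10)},
    index=pd.date_range("2021-01-01", periods=10, freq="D"),
)

# Record any warnings raised while adding the column.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = add_mdates(df, "2021-01-03", "2021-01-05")

# Keep only SettingWithCopyWarning entries, matched by name so the check
# does not depend on where pandas defines the class.
copy_warnings = [w for w in caught if w.category.__name__ == "SettingWithCopyWarning"]
```

Copying the slice rather than the whole incoming df keeps the copy small, and the caller's dataframe is left without an mdates column either way.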

Source: https://tnwei.github.io/posts/setting-with-copy-warning-pandas/

Upvotes: 1
