MontyP
MontyP

Reputation: 61

Error Setting Multi-Conditional Columns in Pandas DataFrame

My code is downloading an excel report and turning it into a dataframe that I then create a few columns based on existing column information. I wasn't experiencing issues before, but am getting this error now:

ValueError: cannot set using a multi-index selection indexer with a different length than the value

Here is a sample of the code. The error occurs on the first line:

df.loc[df['Blank'] != 'ENDING MY','Month'] = pd.DatetimeIndex(df['Date']).month
df.loc[(df['Blank'] == 'ENDING MY') & (df['Commodity'] == 'All Upland Cotton'),'Month'] = 7
df.loc[(df['Blank'] == 'ENDING MY') & (df['Commodity'] == 'All Wheat'),'Month'] = 5
df.loc[(df['Blank'] == 'ENDING MY') & (df['Commodity'] == 'Corn'),'Month'] = 8
df.loc[(df['Blank'] == 'ENDING MY') & (df['Commodity'] == 'Soybeans'),'Month'] = 8
df.loc[(df['Blank'] == 'ENDING MY') & (df['Commodity'] == 'Sorghum'),'Month'] = 8

There are only three potential variables in the 'Blank' column: it is empty, STARTING MY, or ENDING MY. This particular data pull has both ENDING MY and STARTING MY which might be different than when I was testing it.

However, because the code is outputting an error on the first line, the two options are STARTING MY and blank. Before we had blank and not STARTING MY so I tried a line of code before this that was simply:

df.loc[df['Blank'] == 'STARTING MY','Month'] = pd.DatetimeIndex(df['Date']).month

The error threw again on that line.

Does anyone have any ideas why it is causing this issue and what I can do to fix it?

Sample columns from dataframe: ** are column names. The month column is calculated. In this case soybeans are supposed to be month 8 as well.

**Commodity** **Blank** **Value1** **Value 2** **Value 3** **Date**    **Month**
All Wheat                   1           3          4       2020-08-17      8
All Wheat                   4           4          2       2020-08-17      8
Corn                        1           12         5       2020-08-17      8
Corn                        4           24         5       2020-08-17      8
Soybeans      ENDING MY     2           34         24      2020-08-17      8
Soybeans      ENDING MY     34          2          34      2020-08-17      8
Sorghum       STARTING MY   4           45         3       2020-08-17      8
Sorghum       STARTING MY   4           34         4       2020-08-17      8

Upvotes: 0

Views: 119

Answers (1)

NYC Coder
NYC Coder

Reputation: 7594

IIUC, you need to do this, convert the Date column to datetime first then, set the value:

df['Date'] = pd.to_datetime(df['Date'])
df.loc[df['Blank'] == 'STARTING MY','Month'] = df['Date'].dt.month
print(df)

   Commodity        Blank  Value1  Value 2  Value 3       Date  Month
0  All Wheat          NaN       1        3        4 2020-08-17    NaN
1  All Wheat          NaN       4        4        2 2020-08-17    NaN
2       Corn          NaN       1       12        5 2020-08-17    NaN
3       Corn          NaN       4       24        5 2020-08-17    NaN
4   Soybeans    ENDING MY       2       34       24 2020-08-17    NaN
5   Soybeans    ENDING MY      34        2       34 2020-08-17    NaN
6    Sorghum  STARTING MY       4       45        3 2020-08-17    8.0
7    Sorghum  STARTING MY       4       34        4 2020-08-17    8.0

Upvotes: 1

Related Questions