Error when creating a new column in a Pandas dataframe from two other columns

Question

I have the following (toy) data set:

import pandas as pd
import numpy as np

df = pd.DataFrame({'Manufacturer':['Allen Edmonds', 'Louis Vuitton 23', 'Louis Vuitton 8', 'Gulfstream', 'Bombardier', '23 - Louis Vuitton', 'Louis Vuitton 20'],
                   'System':['None', 'None', '14 Platinum', 'Gold', 'None', 'Platinum 905', 'None']
                  })

Next, I create a column named Pricing based on the two existing columns:

df.loc[(df['Manufacturer'].str.contains('Louis')) & 
       (df['System'].str.contains('Platinum')),
      'Pricing'] = 'East Coast'

On the toy data set, this approach works as expected. However, on the production data (which, unfortunately, I cannot share), I see the following error message:

KeyError:  "None of [Float64Index([nan, nan, nan, nan, nan, nan...], 
       dtype='float64', length=583)] are in the [index]"

At first, I thought the error might be caused by whitespace in the column headers, but that doesn't appear to be the case.

The column headers are assigned as follows:

for elem in elements:
    d = {
        'Manufacturer' : issue.fields.manufacturer,
        'System' : issue.fields.system
        }

(the data comes from a database)

Any idea what might be causing this KeyError?
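One possible cause worth ruling out, assuming the database returns missing values (None/NaN) in either column: .str.contains gives NaN rather than True/False for those rows, and a mask that is not purely boolean may be interpreted by .loc as a list of labels, which would be consistent with the Float64Index of nan in the message above. This is only a guess, since the production data isn't shown. The sample rows below are made up for illustration; passing na=False makes missing values count as "no match" so the mask stays boolean:

```python
import pandas as pd

# Hypothetical rows with missing values, standing in for the production data.
df = pd.DataFrame({
    'Manufacturer': ['Louis Vuitton 8', None, '23 - Louis Vuitton', 'Gulfstream'],
    'System': ['14 Platinum', 'Gold', None, None],
})

# na=False treats missing values as non-matches, so each condition is
# a plain boolean Series and the combined mask is safe to pass to .loc.
mask = (
    df['Manufacturer'].str.contains('Louis', na=False)
    & df['System'].str.contains('Platinum', na=False)
)

df.loc[mask, 'Pricing'] = 'East Coast'
print(df)
```

Checking df['Manufacturer'].isna().sum() and df['System'].isna().sum() on the real data would show whether this assumption applies.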

Maybe I need to use an adaptation of:

df['Pricing'] = np.where(df['Manufacturer'].str.contains('Louis'), 'East Coast', 'None')

But I'm not sure how to use np.where with two conditions (see "How to create a column in a Pandas dataframe based on a conditional substring search of one or more OTHER columns" for my original question).
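For reference, a minimal sketch of np.where with two conditions on the toy data: each condition is wrapped in parentheses and the two are combined element-wise with & (not the Python keyword and). The na=False argument is an addition not present in the original attempt; it is included on the assumption that the real data may contain missing values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Manufacturer': ['Allen Edmonds', 'Louis Vuitton 23', 'Louis Vuitton 8',
                     'Gulfstream', 'Bombardier', '23 - Louis Vuitton',
                     'Louis Vuitton 20'],
    'System': ['None', 'None', '14 Platinum', 'Gold', 'None',
               'Platinum 905', 'None'],
})

# Combine both substring checks into one boolean mask.
both = (
    df['Manufacturer'].str.contains('Louis', na=False)
    & df['System'].str.contains('Platinum', na=False)
)

# 'East Coast' where both conditions hold, the string 'None' elsewhere.
df['Pricing'] = np.where(both, 'East Coast', 'None')
print(df)
```

Unlike the df.loc assignment, this fills every row of Pricing, using the string 'None' where the conditions are not met.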

Thanks in advance!
