nam

Reputation: 23753

KeyError - but the key is there

I am using a notebook in Azure Databricks.

Question: In the following code, the column ProductName exists in the data file, but I get the error shown below. What could cause this error, and how can I resolve it? I tried some online suggestions, but with no luck.

KeyError: ProductName

KeyError                                  Traceback (most recent call last)
/databricks/python/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3079             try:
-> 3080                 return self._engine.get_loc(casted_key)
   3081             except KeyError as err:
import sqlalchemy as sq
import pandas as pd

def fn_myFunction(prodname):
  prod_name = prodname.replace('(','')
  prod_name = prod_name.replace(')','')
  prod_name = prod_name.strip()
  return prod_name 

# Raw string so the regex separator `~\*` (a literal "~*") is not treated as a string escape
data_df = pd.read_csv('/dbfs/FileStore/tables/myDataFile.txt', sep=r'~\*', engine='python', quotechar='"', header='infer')
data_df['ProductName'] = data_df['ProductName'].apply(fn_myFunction)
.............

Upvotes: 1

Views: 1683

Answers (1)

wwnde

Reputation: 26676

I don't have Azure Databricks open, so I tried this in Databricks Community Edition with PySpark, and it works fine.

import pandas as pd

# `spark` is the SparkSession that a Databricks notebook provides
df1 = spark.createDataFrame(
  [
    ('123abc', '(type 1)', 1),
    ('456def', 'type 1 ', 1),
    ('789ghi', 'type 2', 0),
    ('101jkl', 'type 3', 0),
  ],
  ('id', 'category', 'flag'),
)

df = df1.toPandas()

Function

def fn_myFunction(col):
  # Remove literal parentheses, then trim surrounding whitespace
  col = col.str.replace(r'\(', '', regex=True)
  col = col.str.replace(r'\)', '', regex=True)
  col = col.str.strip()
  return col

Solution

df['category'] = fn_myFunction(df['category'])

Output

     id category  flag
0  123abc   type 1     1
1  456def   type 1     1
2  789ghi   type 2     0
3  101jkl   type 3     0
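
Since the cleanup itself works on a clean frame, the KeyError more likely comes from how the header was parsed than from the function. I can't confirm this without the file, but a common cause is a column name that only looks like ProductName: it may carry stray whitespace, surrounding quotes, or a byte-order mark. The pandas documentation also notes that regex delimiters are prone to ignoring quoted data, so with sep='~\*' the quotes may end up inside the column names. Below is a minimal sketch for checking and normalising the names. The sample frame and its messy header are hypothetical stand-ins for the real file.

```python
import pandas as pd

# Hypothetical frame standing in for the parsed file: the first column name
# carries a BOM, surrounding quotes and a trailing space (an assumption).
data_df = pd.DataFrame({
    '\ufeff"ProductName" ': ['(Widget)', ' Gadget '],
    'Price': [10, 20],
})

# repr() exposes hidden characters that a plain print would not show
print([repr(c) for c in data_df.columns])

# Normalise the header: drop the BOM, then strip whitespace and quotes
data_df.columns = (
    data_df.columns
    .str.replace('\ufeff', '', regex=False)
    .str.strip()
    .str.strip('"')
    .str.strip()
)

# The lookup now succeeds, so the cleanup can run on the column
data_df['ProductName'] = (
    data_df['ProductName']
    .str.replace(r'[()]', '', regex=True)  # remove literal parentheses
    .str.strip()
)
print(data_df)
```

If the repr output already shows a clean 'ProductName', the header is not the problem and the separator is the next thing to check, for example by printing data_df.shape to see whether the file was split into the expected number of columns.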

Upvotes: 1
