Reputation: 23753
I am using a notebook in Azure Databricks.
Question: In the following code, the key ProductName exists, but I am getting the error shown below. What could be the reason for this error, and how can I resolve it? I tried some online suggestions, but still no luck.
KeyError: ProductName
KeyError Traceback (most recent call last)
/databricks/python/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
3079 try:
-> 3080 return self._engine.get_loc(casted_key)
3081 except KeyError as err:
import sqlalchemy as sq
import pandas as pd

def fn_myFunction(prodname):
    # Remove parentheses and surrounding whitespace from a product name.
    prod_name = prodname.replace('(', '')
    prod_name = prod_name.replace(')', '')
    prod_name = prod_name.strip()
    return prod_name

# Fields are separated by the literal characters "~*" (given as a regex, hence the raw string).
data_df = pd.read_csv('/dbfs/FileStore/tables/myDataFile.txt', sep=r'~\*', engine='python', quotechar='"', header='infer')
data_df['ProductName'] = data_df['ProductName'].apply(lambda x: fn_myFunction(x))
.............
Upvotes: 1
Views: 1683
Reputation: 26676
I don't have Azure Databricks open. I tried this with PySpark in Databricks Community Edition and it works fine.
import pandas as pd

df1 = spark.createDataFrame([
    ('123abc', '(type 1)', 1),
    ('456def', 'type 1 ', 1),
    ('789ghi', 'type 2', 0),
    ('101jkl', 'type 3', 0)
],
('id', 'category', 'flag'))
df = df1.toPandas()
Function
def fn_myFunction(col):
    # Operates on a whole pandas Series: strip parentheses, then surrounding whitespace.
    col = col.str.replace(r'\(', '', regex=True)
    col = col.str.replace(r'\)', '', regex=True)
    col = col.str.strip()
    return col
Solution
df['category'] = fn_myFunction(df['category'])
Output
       id category  flag
0  123abc   type 1     1
1  456def   type 1     1
2  789ghi   type 2     0
3  101jkl   type 3     0
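As for the KeyError itself, one possible cause (an assumption, since the original file is not shown) is that the header label is not exactly ProductName. The pandas documentation notes that regex delimiters are prone to ignoring quoted data, so with sep=r'~\*' the labels may keep stray whitespace or quote characters. The sketch below uses made-up sample data and a hypothetical helper, clean_name, to show how to inspect the labels with repr() and normalize them before applying the cleanup.

```python
import io
import pandas as pd

# Hypothetical sample data (not the asker's file): the header has a trailing
# space after ProductName, and fields are separated by the literal "~*".
raw = 'ProductName ~*Price\n(Widget)~*10\n Gadget ~*20\n'

df = pd.read_csv(io.StringIO(raw), sep=r'~\*', engine='python')

# repr() exposes hidden whitespace or quote characters in the labels.
print([repr(c) for c in df.columns])

# Record whether the exact key exists before any normalization.
had_key = 'ProductName' in df.columns

# Normalize labels: drop surrounding whitespace and any double quotes.
df.columns = df.columns.str.strip().str.strip('"')


def clean_name(name):
    """Remove parentheses and surrounding whitespace from one value."""
    return name.replace('(', '').replace(')', '').strip()


df['ProductName'] = df['ProductName'].apply(clean_name)
print(df)
```

If the printed labels already match exactly, the cause lies elsewhere, for example a separator that does not match the file so that the whole header is read as a single column.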
Upvotes: 1