Reputation: 3
Below is the code I am using to create a DataFrame based on a condition. It works fine when the value of the variable (my_var) is written out directly, but when I use the variable name instead, pandas raises a KeyError. Is there a way to overcome this?
# df3 and df4 are pandas DataFrames
print(my_var)
Output : (df3['Division Code'] == 0)|(df3['Division Code'] ==10)|(df3['Division Code'] ==20)|(df3['Division Code'] ==30)
df4=df3[(df3['Division Code'] == 0)|(df3['Division Code'] ==10)|(df3['Division Code'] ==20)|(df3['Division Code'] ==30)]
Output: Working fine and no errors
df4=df3[my_var]
Output: KeyError: "(df3['Division Code'] == 0)|(df3['Division Code'] ==10)|(df3['Division Code'] ==20)|(df3['Division Code'] ==30)"
Upvotes: 0
Views: 381
Reputation: 1155
I suppose that my_var is a string variable which contains "(df3['Division Code'] == 0)|(df3['Division Code'] ==10)|(df3['Division Code'] ==20)|(df3['Division Code'] ==30)".
Because it is a string, pandas treats it as a column label and looks for a column with this (weird) name. No such column exists, hence the KeyError.
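To make the difference concrete, here is a minimal sketch. The small df3 below is a made-up stand-in for the asker's data, and the condition is shortened to two codes for brevity; the point is only that indexing with a string triggers a column lookup, while indexing with a boolean Series filters rows.

```python
import pandas as pd

# Hypothetical stand-in for the asker's df3 (values invented for illustration)
df3 = pd.DataFrame({"Division Code": [0, 5, 10, 20, 30, 40]})

# A string that merely *looks* like a condition; pandas never evaluates it
my_var = "(df3['Division Code'] == 0)|(df3['Division Code'] == 10)"

try:
    df3[my_var]          # interpreted as "give me the column with this name"
    raised = False
except KeyError:
    raised = True        # no such column, so pandas raises KeyError

# The same expression written as code produces a boolean Series (a mask)
mask = (df3["Division Code"] == 0) | (df3["Division Code"] == 10)
df4 = df3[mask]          # keeps only the rows where the mask is True
```

Printing a string and printing the source of an expression look identical on screen, which is probably why the two cases appear to be "the same value" in the question.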
A better way would be to define my_var as a boolean mask:
my_var = df3['Division Code'].isin((0,10,20,30))
df4=df3[my_var]
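As a quick check, the sketch below compares the isin mask with the chained comparisons from the question. The sample df3, the extra Name column, and the codes list are assumptions made up for illustration; only isin and the | operator come from the thread.

```python
import pandas as pd

# Hypothetical sample data standing in for df3
df3 = pd.DataFrame({
    "Division Code": [0, 5, 10, 20, 30, 40],
    "Name": ["a", "b", "c", "d", "e", "f"],
})

codes = [0, 10, 20, 30]                      # values to keep
my_var = df3["Division Code"].isin(codes)    # boolean Series, one entry per row
df4 = df3[my_var]

# The chained form from the question, written as code rather than as a string
chained = (
    (df3["Division Code"] == 0)
    | (df3["Division Code"] == 10)
    | (df3["Division Code"] == 20)
    | (df3["Division Code"] == 30)
)

# Both masks select exactly the same rows
same = bool((my_var == chained).all())
```

Keeping the codes in a list also means the set of divisions can be changed in one place without rebuilding a long expression.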
Upvotes: 1