Abhiram M V

Reputation: 3

Why am I getting a ComputeError when performing a left join between two Polars DataFrames?

The code snippet below is inside a method of a class. I am using Polars instead of pandas. When I run the method, the left join raises an error.

import polars as pl

inventory = pl.from_pandas(inventory)

# If a production order is provided, merge the inventory and production order information
if isinstance(self.production_order, pl.DataFrame) and (self.production_order.shape[0] > 0):
    inventory = inventory.join(self.production_order, on='material_number', how='left')

I get the following error:

ComputeError: Joins/or comparisons on categorical dtypes can only happen if they are created under the same global string cache.Hint: set a global StringCache

I tried wrapping the join in 'with pl.StringCache():', but it still shows the same error. What can I do to fix it?

Upvotes: -1

Views: 387

Answers (1)

ritchie46

Reputation: 14730

The string cache must be active when the categorical columns in your tables are created, not just when you join them. You can do this for the whole script or program by calling pl.enable_string_cache() at the start of your program.

Alternatively, you can enable the string cache only while the tables are being created; the context manager cleans it up afterwards.

with pl.StringCache():  # string cache is active only while inventory is created
    inventory = pl.from_pandas(inventory)
    # Note: self.production_order's categoricals must be created under the same cache.

if isinstance(self.production_order, pl.DataFrame) and (self.production_order.shape[0] > 0):
    inventory = inventory.join(self.production_order, on='material_number', how='left')

Upvotes: 0
