MarcelloDG

Reputation: 98

Cache data in Snowpark using modin.pandas

I'm new to Snowflake and I'm trying to run a Python script that uses pandas and NumPy on Snowpark. Following the documentation, I replaced the pandas import with modin.pandas and imported the Snowpark plugin (import snowflake.snowpark.modin.plugin).
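For reference, the import swap described above looks roughly like the sketch below. The try/except fallback to plain pandas and the BACKEND flag are assumptions added only for illustration, so the same script can also be stepped through locally on a small sample; they are not part of the original script or of the Snowflake documentation.

```python
# Sketch of the import swap: modin.pandas plus the Snowpark plugin replace
# the plain pandas import. The fallback branch and the BACKEND name are
# hypothetical additions for illustration only.
try:
    import modin.pandas as pd
    import snowflake.snowpark.modin.plugin  # noqa: F401  (registers the Snowflake backend)
    BACKEND = "snowpark-pandas"
except ImportError:
    # Packages not installed: fall back to plain pandas so the script still runs locally.
    import pandas as pd
    BACKEND = "pandas"

import numpy as np

print(f"Using backend: {BACKEND}")
```

With the Snowflake backend, an active Snowpark session still has to be created before any DataFrame operation runs; that part is omitted here because the connection details are specific to each account.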

The aim is to use the Snowflake backend with no changes (or only minimal changes) to my script. Some operations are not covered by the modin.pandas interface, and I've written workarounds for those. Unfortunately, debugging is practically impossible: because of lazy evaluation, all the logic is rerun every time, so just to inspect the content of a variable (here idx) in the VS Code debug console I have to wait a very long time.


Here is my question:

Is there any way to cache intermediate results (like in Spark) so that not everything is re-executed each time? Is there a better way to debug this?

Thanks!

Upvotes: 1

Views: 90

Answers (0)
