teelove

Reputation: 71

Error with Pandas Profiling on Databricks using a dataframe

Can someone help me understand how to get pandas-profiling working with a dataframe?

Using this post (Unable to run Pandas Profiling on Databricks), I was able to replicate the output using a dictionary, but when using a dataframe I get the following errors:

[screenshot of the error messages]

I have installed all the libraries without errors, and I can view the dataframe with no issues. Is this something to do with the storage location? I have read/write access to this location.

Upvotes: 1

Views: 462

Answers (1)

Alex Ott

Reputation: 87154

You can't run pandas-profiling directly on a Spark DataFrame. You need to convert it to a pandas DataFrame first using the .toPandas() function, like this:

from pandas_profiling import ProfileReport
profile = ProfileReport(df.toPandas(), title='EDA Report', explorative=True)
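One thing to keep in mind is that .toPandas() collects the whole dataset onto the driver, so a large table may not fit in memory. Below is a minimal sketch of one way to cap the row count before converting. The helper names `to_pandas_sample` and `build_profile` and the default `max_rows` value are my own assumptions for illustration, not part of pandas-profiling or Spark. Note that `limit()` returns an arbitrary subset of rows, not a random sample.

```python
def to_pandas_sample(spark_df, max_rows=100_000):
    """Convert at most `max_rows` rows of a Spark DataFrame to pandas.

    Hypothetical helper: it only relies on the Spark DataFrame methods
    limit() and toPandas(). The rows kept by limit() are arbitrary.
    """
    return spark_df.limit(max_rows).toPandas()


def build_profile(spark_df, title="EDA Report", max_rows=100_000):
    """Build a ProfileReport from a Spark DataFrame (hypothetical helper)."""
    # Imported here so the module still loads where the package is missing.
    from pandas_profiling import ProfileReport

    pandas_df = to_pandas_sample(spark_df, max_rows)
    return ProfileReport(pandas_df, title=title, explorative=True)
```

In a Databricks notebook you could then try something like `displayHTML(build_profile(df).to_html())` to render the report inline, assuming `df` is your Spark DataFrame.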

Upvotes: 1
