Data_Insight

Reputation: 585

How to convert a Scala spark.sql.DataFrame to a pandas DataFrame

I want to convert a Scala Spark DataFrame into a pandas DataFrame:

    val collection = spark.read.sqlDB(config)
    collection.show()

    // Desired result: a pandas DataFrame, something like df = collection

Upvotes: 0

Views: 1988

Answers (2)

Selnay

Reputation: 711

You are asking how to use a Python library from Scala, which seems odd to me. Are you sure you need to do that? You may already know this, but Spark's Scala DataFrame API is rich and will probably give you the functionality you want from pandas.

If you still need pandas, I suggest writing the data you need to a file (a CSV, for example). A Python application can then load that file into a pandas DataFrame and work from there.
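As a rough sketch of the Python side, assume the Scala job exported the data with something like `collection.write.option("header", "true").csv("/tmp/collection_csv")`. The path and both helper names below are hypothetical. Spark writes a directory of part files rather than a single CSV, so the sketch collects them and concatenates the results:

```python
import glob
import os


def find_part_files(output_dir):
    """Return the CSV part files Spark wrote into output_dir, sorted by name."""
    # Spark writes a directory of part-* files plus markers such as _SUCCESS.
    return sorted(glob.glob(os.path.join(output_dir, "part-*.csv")))


def load_spark_csv(output_dir):
    """Concatenate all part files into a single pandas DataFrame."""
    # Imported here so the file-listing helper works without pandas installed.
    import pandas as pd

    parts = find_part_files(output_dir)
    if not parts:
        raise FileNotFoundError(f"no part-*.csv files found in {output_dir}")
    # Assumes every part file starts with a header row (header=true on write).
    frames = [pd.read_csv(path) for path in parts]
    return pd.concat(frames, ignore_index=True)
```

Calling `load_spark_csv("/tmp/collection_csv")` would then give a single pandas DataFrame, provided the exported data fits in memory on one machine.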

Trying to create a pandas object from Scala is probably overcomplicating things (and I am not sure it is currently possible).

Upvotes: 1

Ravi

Reputation: 470

If you want to use a pandas-based API in Spark code, you can install Koalas, a Python library. It lets you use pandas API functions directly in your Spark code.

To install Koalas:

    pip install koalas
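A minimal sketch of how Koalas might be used, assuming the code runs in a PySpark session (Koalas is a Python library, so this does not run from Scala). The function name and the `spark_df` argument are hypothetical. Since Spark 3.2 the same API also ships with Spark as `pyspark.pandas`.

```python
def spark_to_pandas_via_koalas(spark_df):
    """Convert a PySpark DataFrame to pandas by way of a Koalas DataFrame."""
    # Importing Koalas attaches to_koalas() to PySpark DataFrames.
    import databricks.koalas as ks  # noqa: F401

    # A Koalas DataFrame has a pandas-like API but is still distributed.
    kdf = spark_df.to_koalas()
    # to_pandas() collects every row to the driver, so the data must fit in memory.
    return kdf.to_pandas()
```

If the pandas-style operations are all you need, you can work on the Koalas DataFrame directly and skip the final `to_pandas()` call.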

Upvotes: 0
