user17241

Reputation: 317

Converting a Spark DataFrame to a pandas DataFrame

I have a Spark DataFrame that I created this way:

tx_df = (spark
         .read
         .parquet("/data/file"))



tx_ecommerce = tx_df.filter(tx_df["POS_Cardholder_Presence"]=="ECommerce").show()

I am trying to convert tx_ecommerce to a pandas DataFrame. I tried this:

tx_ecommerce.toPandas()

But I got this error :

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
in
----> 1 tx_ecommerce.toPandas()

AttributeError: 'NoneType' object has no attribute 'toPandas'

How can I resolve this error?

Thanks.

Upvotes: 1

Views: 10512

Answers (2)

Melissa Ouneslii

Reputation: 1

You can read the parquet file directly with pandas and filter it there:

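import pandas as pd

# Read the parquet file straight into pandas (no Spark involved)
txt = pd.read_parquet("/data/file.parquet")
# Keep only the rows where the cardholder presence is "ECommerce"
txt_ecommerce = txt.loc[txt.POS_Cardholder_Presence == "ECommerce"]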

Upvotes: 0

Yasi Klingler

Reputation: 636

When you put .show() at the end, the result is no longer a PySpark DataFrame: .show() only prints the rows and returns None, so tx_ecommerce ends up being None.

Remove it and it should work:

tx_ecommerce = tx_df.filter(tx_df["POS_Cardholder_Presence"] == "ECommerce")

tx_ecommerce.toPandas()

Upvotes: 4
