Reputation: 317
I have a Spark DataFrame that I created this way:
tx_df = (spark
.read
.parquet("/data/file"))
tx_ecommerce = tx_df.filter(tx_df["POS_Cardholder_Presence"]=="ECommerce").show()
I am trying to convert tx_ecommerce to a pandas DataFrame. I tried this:
tx_ecommerce.toPandas()
But I got this error:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
----> 1 tx_ecommerce.toPandas()
AttributeError: 'NoneType' object has no attribute 'toPandas'
Can anyone help me resolve this problem?
Thanks.
Upvotes: 1
Views: 10512
Reputation: 1
You can read the parquet file directly with pandas and filter it there:
import pandas as pd
txt = pd.read_parquet("/data/file.parquet")
txt_ecommerce = txt.loc[txt.POS_Cardholder_Presence == "ECommerce"]
Upvotes: 0
Reputation: 636
When you put .show() at the end, tx_ecommerce is no longer a PySpark DataFrame: .show() prints the rows and returns None, so None is what gets assigned.
Remove it and it should work:
tx_ecommerce = tx_df.filter(tx_df["POS_Cardholder_Presence"] == "ECommerce")
tx_ecommerce.toPandas()
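To see why the trailing .show() produces the 'NoneType' error, here is a minimal sketch that runs without Spark. TinyFrame, its to_list method, and the sample rows are hypothetical stand-ins invented for this illustration and are not part of PySpark; the only behavior borrowed from the real API is that filter returns a new frame while show prints and returns None.

```python
class TinyFrame:
    """Hypothetical stand-in for a Spark DataFrame; not part of PySpark."""

    def __init__(self, rows):
        self.rows = rows

    def filter(self, predicate):
        # Returns a new frame, so the result can be chained or assigned.
        return TinyFrame([row for row in self.rows if predicate(row)])

    def show(self):
        # Prints the rows and implicitly returns None, like DataFrame.show().
        for row in self.rows:
            print(row)

    def to_list(self):
        # Hypothetical counterpart of toPandas(): converts the frame to local data.
        return list(self.rows)


frame = TinyFrame([{"presence": "ECommerce"}, {"presence": "InStore"}])

# Mistake from the question: the value assigned is whatever show() returns.
broken = frame.filter(lambda row: row["presence"] == "ECommerce").show()

# Fix: assign the filtered frame first, then display it in a separate call.
kept = frame.filter(lambda row: row["presence"] == "ECommerce")
kept.show()

try:
    broken.to_list()
except AttributeError as exc:
    # Same kind of failure as in the question's traceback.
    error_message = str(exc)
```

The same pattern applies to the real code: keep the result of filter in tx_ecommerce, and call tx_ecommerce.show() on its own line if you still want to inspect the rows before calling toPandas().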
Upvotes: 4