Bjay

Reputation: 69

Creating a PySpark DataFrame from a Python dictionary with a special character

I have a list of Python dictionaries as below:

data = [{"cust_decision": "buy", "cust_details": "Easy to use"}, {"cust_decision": "buy", "cust_details": "econoimical"}, {"cust_decision":"no buy", "cust_details": "Didn’t like Product"}]

I create a PySpark DataFrame and a temp view from this data as follows:

from pyspark.sql import SparkSession, Row
spark.createDataFrame([Row(**i) for i in data]).createOrReplaceTempView("cust")

Now, when I look at the data in this temp view, the special character ’ (a right single quotation mark, not the plain single quote ') is replaced by a different character, â. The result is below:

spark.table("cust").show(10,False)
+-------------+---------------------+                                           
|cust_decision|cust_details         |
+-------------+---------------------+
|buy          |Easy to use          |
|buy          |econoimical          |
|no buy       |Didnât like Product|
+-------------+---------------------+ 

I'd like the character to be preserved as is in every value. How can I achieve this? The expected result is below:

+-------------+---------------------+                                           
|cust_decision|cust_details         |
+-------------+---------------------+
|buy          |Easy to use          |
|buy          |econoimical          |
|no buy       |Didn’t like Product  |
+-------------+---------------------+ 

Thanks.

Upvotes: 1

Views: 157

Answers (1)

notNull

Reputation: 31490

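Try decoding the values in your data from UTF-8 before creating the DataFrame. This assumes Python 2, where plain string literals are byte strings; in Python 3, str is already Unicode and has no decode method.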

data = [{"cust_decision": "buy", "cust_details": "Easy to use"}, {"cust_decision": "buy", "cust_details": "econoimical"}, {"cust_decision":"no buy", "cust_details": "Didn’t like Product"}]

# Decode each UTF-8 byte string value into a unicode object
decode_data = [{k: v.decode("utf-8") for k, v in i.items()} for i in data]

from pyspark.sql import SparkSession, Row
spark.createDataFrame([Row(**i) for i in decode_data]).createOrReplaceTempView("cust")

spark.table("cust").show(10,False)
#+-------------+-------------------+
#|cust_decision|cust_details       |
#+-------------+-------------------+
#|buy          |Easy to use        |
#|buy          |econoimical        |
#|no buy       |Didn’t like Product|
#+-------------+-------------------+
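To see why decoding helps, here is a minimal sketch that reproduces the garbling in plain Python 3, without Spark. It assumes the UTF-8 bytes of ’ were read one byte per character (as Latin-1) somewhere along the way, which would match the â in the question; I have not verified that this is exactly what happens inside Spark. `decode_values` is a hypothetical helper name, not part of PySpark.

```python
# Sketch: reproduce the garbled apostrophe without Spark (Python 3).
original = "Didn\u2019t like Product"   # U+2019 is the right single quotation mark

raw = original.encode("utf-8")          # the apostrophe becomes three bytes: E2 80 99
garbled = raw.decode("latin-1")         # assumed misread: one character per byte
repaired = raw.decode("utf-8")          # same bytes read with the matching codec


def decode_values(record, encoding="utf-8"):
    """Hypothetical helper: decode only the values that are byte strings."""
    return {
        key: value.decode(encoding) if isinstance(value, bytes) else value
        for key, value in record.items()
    }


print(repr(garbled))            # the first of the three misread characters is 'â'
print(repaired == original)     # True
print(decode_values({"cust_decision": "no buy", "cust_details": raw}))
```

Because the helper leaves values that are already text untouched, the same list comprehension can be applied to `data` whether the values are byte strings or Unicode strings.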

Upvotes: 1
