Reputation: 69
I have a Python list of dictionaries as below:
data = [{"cust_decision": "buy", "cust_details": "Easy to use"}, {"cust_decision": "buy", "cust_details": "econoimical"}, {"cust_decision":"no buy", "cust_details": "Didn’t like Product"}]
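I am creating a PySpark DataFrame and a temp view on this data as below:
from pyspark.sql import SparkSession, Row
# Reuse the active session (already defined as `spark` in the pyspark shell)
spark = SparkSession.builder.getOrCreate()
spark.createDataFrame([Row(**i) for i in data]).createOrReplaceTempView("cust")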
Now, when I view the data in this temp view, the special character ’ (this is not a single quote '; it is ’) is changed to different characters (’). Below is the result:
spark.table("cust").show(10,False)
+-------------+---------------------+
|cust_decision|cust_details         |
+-------------+---------------------+
|buy          |Easy to use          |
|buy          |econoimical          |
|no buy       |Didn’t like Product|
+-------------+---------------------+
But I'd like to keep the character as is in every value. How can I achieve this? Below is the expected result:
+-------------+---------------------+
|cust_decision|cust_details         |
+-------------+---------------------+
|buy          |Easy to use          |
|buy          |econoimical          |
|no buy       |Didn’t like Product  |
+-------------+---------------------+
Thanks.
Upvotes: 1
Views: 157
Reputation: 31490
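Try decoding the string values in your data list from UTF-8. This applies to Python 2, where plain string literals are byte strings; in Python 3, str has no decode method and the values are already Unicode text.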
data = [{"cust_decision": "buy", "cust_details": "Easy to use"}, {"cust_decision": "buy", "cust_details": "econoimical"}, {"cust_decision":"no buy", "cust_details": "Didn’t like Product"}]
# Python 2: each value is a byte string, so decode it to Unicode text
decode_data = [{k: v.decode("utf-8") for k, v in i.items()} for i in data]
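from pyspark.sql import SparkSession, Row
# Reuse the active session (already defined as `spark` in the pyspark shell)
spark = SparkSession.builder.getOrCreate()
spark.createDataFrame([Row(**i) for i in decode_data]).createOrReplaceTempView("cust")
spark.table("cust").show(10, False)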
#+-------------+-------------------+
#|cust_decision|cust_details       |
#+-------------+-------------------+
#|buy          |Easy to use        |
#|buy          |econoimical        |
#|no buy       |Didn’t like Product|
#+-------------+-------------------+
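For context, here is a minimal sketch of where the ’ sequence may come from and how the decode step could be written so that it also runs on Python 3. It assumes the values reach Python as UTF-8 bytes and that the garbling comes from those bytes being read as Windows-1252; `to_text` and `decode_rows` are hypothetical helper names for illustration, not part of PySpark.

```python
# Hypothetical helpers (not part of PySpark): normalise every value to text
# before building Rows, assuming any bytes values are UTF-8 encoded.

def to_text(value, encoding="utf-8"):
    """Decode bytes to text; return any other value unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def decode_rows(rows, encoding="utf-8"):
    """Apply to_text to every value of every dict in rows."""
    return [{k: to_text(v, encoding) for k, v in row.items()} for row in rows]


# The right single quotation mark (U+2019) takes three bytes in UTF-8.
quote_bytes = u"\u2019".encode("utf-8")

# Assumption: reading those three bytes as Windows-1252 yields three
# separate characters, which matches the garbled output in the question.
garbled = quote_bytes.decode("cp1252")

# Sample input whose values are UTF-8 bytes rather than text.
raw = [{"cust_decision": b"no buy",
        "cust_details": u"Didn\u2019t like Product".encode("utf-8")}]
clean = decode_rows(raw)
```

The cleaned list can then be passed to spark.createDataFrame([Row(**i) for i in clean]) in the same way as decode_data above. If the values are already text, the helpers leave them untouched, so a remaining ’ would point to the console or file encoding rather than to the data itself.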
Upvotes: 1