Reputation: 3652
Hi, I have a list of tuples, each containing a string and a NumPy float64 value. I would like to convert it to a Spark DataFrame, but I am getting errors. The list and the error are shown below.
This is my code:
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

schema = StructType([StructField("key", StringType(), True), StructField("value", DoubleType(), True)])
coef_df = spark.createDataFrame(coef_list, schema)
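The root cause can be seen without Spark at all: `np.float64` does subclass Python's `float`, but its concrete type is still `numpy.float64`, and the Spark versions this question targets verify schema values by comparing `type(obj)` against the accepted types directly, so the subclass relationship does not help. A quick illustration (pure NumPy):

```python
import numpy as np

v = np.float64(-2.1412)

# np.float64 actually subclasses Python float...
print(isinstance(v, float))     # True
# ...but its concrete type is a distinct NumPy type,
# which is what Spark's schema verification looks at.
print(type(v) is float)         # False
# Casting produces a plain built-in float that passes verification.
print(type(float(v)) is float)  # True
```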
Upvotes: 0
Views: 784
Reputation: 3110
As @user6910411 suggests, Spark SQL doesn't support NumPy types (yet).
Here is a slightly simpler solution for you (incorporating the comment as well):
import numpy as np

# np.unicode was removed in newer NumPy releases; np.str_ is the equivalent
data = [
    (np.str_('100912strategy_id'), np.float64(-2.1412)),
    (np.str_('10exchange_ud'), np.float64(-1.2412))]

df = (sc.parallelize(data)
      .map(lambda x: (str(x[0]), float(x[1])))  # cast NumPy types to native Python str/float
      .toDF(["key", "value"]))
df.show()
+-----------------+-------+
| key| value|
+-----------------+-------+
|100912strategy_id|-2.1412|
| 10exchange_ud|-1.2412|
+-----------------+-------+
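Alternatively, if you want to keep the explicit schema from the question, you can cast the NumPy values to native Python types first and pass the cleaned list straight to `createDataFrame`. A minimal sketch, reusing the question's `coef_list` and `schema` names (the Spark call is commented out since it needs a live `SparkSession`):

```python
import numpy as np

# Stand-in for the question's list of (np.str_, np.float64) tuples
coef_list = [
    (np.str_("100912strategy_id"), np.float64(-2.1412)),
    (np.str_("10exchange_ud"), np.float64(-1.2412)),
]

# Cast each element to a built-in str/float so Spark's
# schema verification accepts the rows
clean_list = [(str(k), float(v)) for k, v in coef_list]

# With a SparkSession available, the original call then works unchanged:
# coef_df = spark.createDataFrame(clean_list, schema)
```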
Upvotes: 2