Reputation: 39
I have a Python list of 10,000 elements (10000×1). I want to add it to a Spark DataFrame so that the DataFrame consists of 10,000 rows. How do I do that?
Upvotes: 2
Views: 9322
Reputation: 13926
First, create a DataFrame from the list (avoid naming the variable `list`, which shadows the Python built-in):
new_df = spark.createDataFrame([(value,) for value in my_list], ['id'])
Then union both DataFrames:
base.union(new_df).show()
Remember that the column names and types in both DataFrames must match.
Upvotes: 3
Reputation: 4719
It looks like you want to add a literal value:
from pyspark.sql import functions as f
df = spark.sparkContext.parallelize([('idx',)]).toDF()
res = df.withColumn('literal_col', f.lit('strings'))
res.show(truncate=False)
# output:
+---+-----------+
|_1 |literal_col|
+---+-----------+
|idx|strings    |
+---+-----------+
Upvotes: 0