Reputation: 39
I have a Python list of 10,000 elements (10000×1). I want to add it to a Spark DataFrame so that the DataFrame consists of 10,000 rows. How do I do that?
Upvotes: 2
Views: 9322
Reputation: 13926
First, create a DataFrame from the list (avoid naming the variable `list`, which shadows the Python built-in):
new_df = spark.createDataFrame([(value,) for value in my_list], ['id'])
Then union both DataFrames:
base.union(new_df).show()
Remember that the column names and types in both DataFrames must match.
Upvotes: 3
Reputation: 4719
It looks like you want to add a literal value:
from pyspark.sql import functions as f
df = spark.sparkContext.parallelize([('idx',)]).toDF()
res = df.withColumn('literal_col', f.lit('strings'))
res.show(truncate=False)
# output:
+---+-----------+
|_1 |literal_col|
+---+-----------+
|idx|strings    |
+---+-----------+
Upvotes: 0