Reputation: 2998
I have a dataframe with a single row and multiple columns, and I would like to convert it into multiple rows. I found a similar question here on Stack Overflow.
That question shows how it can be done in Scala, but I want to do it in PySpark. I tried to replicate the code in PySpark but wasn't able to.
I am not able to convert the Scala code below to Python:
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{col, lit, map}
// Interleave each column name (as a literal) with its value column.
val columnsAndValues: Array[Column] = df1.columns.flatMap { c => Array(lit(c), col(c)) }
val df2 = df1.withColumn("myMap", map(columnsAndValues: _*))
Upvotes: 0
Views: 1210
Reputation: 32660
In PySpark you can use the create_map
function to create a map column, and a list comprehension with itertools.chain
to get the equivalent of Scala's flatMap:
import itertools
from pyspark.sql import functions as F

# Interleave each column name (as a literal) with its value column,
# flattening the pairs the way Scala's flatMap does.
columns_and_values = itertools.chain(*[(F.lit(c), F.col(c)) for c in df1.columns])
df2 = df1.withColumn("myMap", F.create_map(*columns_and_values))
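Since the original goal was multiple rows rather than just a map column, you can follow this with F.explode, which turns each map entry into its own row. A minimal end-to-end sketch, assuming an existing SparkSession named spark and hypothetical columns col1/col2/col3:
import itertools
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# A single-row DataFrame with multiple columns (hypothetical sample data).
df1 = spark.createDataFrame([(1, 2, 3)], ["col1", "col2", "col3"])

# Build the map column as above, then explode it into one row per column.
columns_and_values = itertools.chain(*[(F.lit(c), F.col(c)) for c in df1.columns])
df2 = df1.withColumn("myMap", F.create_map(*columns_and_values))
df2.select(F.explode("myMap").alias("column", "value")).show()
# +------+-----+
# |column|value|
# +------+-----+
# |  col1|    1|
# |  col2|    2|
# |  col3|    3|
# +------+-----+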
Upvotes: 1