Reputation: 15
from pyspark.sql import Row
from pyspark.sql.functions import col

df = spark.sparkContext.parallelize([
    Row(name='Angel', age=5, height=None, weight=40, desc="Where is Angel"),
    Row(name='Bobby', age=None, height=40, weight=50, desc="This is Bobby")
]).toDF()
df.select(map(col("desc"), col("age")).alias("complex_map"))\
    .selectExpr("explode(complex_map)").show(2)
While running the above code, I am getting this error: TypeError: Column is not iterable
Please let me know where I am going wrong.
Upvotes: 1
Views: 637
Reputation: 42412
You need to use the create_map function, not the native Python map:
import pyspark.sql.functions as F
df.select(F.create_map(F.col("desc"), F.col("age")).alias("complex_map"))\
    .selectExpr("explode(complex_map)").show(2)
To simplify the code further, you can explode directly inside the select. Note that exploding a map produces two columns (key and value), so supply two aliases rather than one:
df.select(
    F.explode(
        F.create_map(F.col("desc"), F.col("age"))
    ).alias("key", "value")
).show(2)
Upvotes: 3