Reputation: 1857
I have a StringType() column and an ArrayType(StringType()) column in a PySpark DataFrame. I want to concatenate the StringType() column with every element of the ArrayType(StringType()) column.
Example (col3 is the desired result):
+-----+---------------------+------------------------------+
|col1 |col2                 |col3                          |
+-----+---------------------+------------------------------+
|'AQQ'|['ABC', 'DEF']       |['AQQABC', 'AQQDEF']          |
|'APP'|['ABC', 'DEF', 'GHI']|['APPABC', 'APPDEF', 'APPGHI']|
+-----+---------------------+------------------------------+
Thanks :)
Upvotes: 0
Views: 70
Reputation: 13998
For Spark 2.4+, use the transform higher-order function:
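from pyspark.sql.functions import expr
# Sample data matching the question (assumes an active SparkSession named `spark`).
df = spark.createDataFrame([('AQQ', ['ABC', 'DEF']), ('APP', ['ABC', 'DEF', 'GHI'])], ['col1', 'col2'])
# For each element x of col2, build concat(col1, x); the lambda can reference col1 directly.
df.withColumn('col3', expr("transform(col2, x -> concat(col1, x))")).show(truncate=False)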
+----+---------------+------------------------+
|col1|col2           |col3                    |
+----+---------------+------------------------+
|AQQ |[ABC, DEF]     |[AQQABC, AQQDEF]        |
|APP |[ABC, DEF, GHI]|[APPABC, APPDEF, APPGHI]|
+----+---------------+------------------------+
Upvotes: 1