SISI

Reputation: 1

Convert a column to a list using when with two DataFrames

I have two DataFrames. When the CONCERN column of Dataframe2 contains 'ALL', I want the new column "EFFECTIVITY" (in the same DataFrame) to hold a list of all the serial numbers from the "SN" column of Dataframe1.

Here df1 = Dataframe1 and df2 = Dataframe2.

all_data = df1.select(collect_list("SN")).show()

df = df.withColumn("EFFECTIVITY", F.when(df2.CONCERN.contains('ALL'), all_data).otherwise(''))

Upvotes: 0

Views: 49

Answers (1)

Mikey

Reputation: 5

Try the approach below; it may solve your problem. In your code, show() only prints the result and returns None, so all_data never holds the list.

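from pyspark.sql import functions as F

# Collect every SN value from df1 into a plain Python list on the driver
all_data = df1.select(F.collect_list("SN")).first()[0]

# when() expects a literal or a Column, so wrap the list in an array of literals
all_sn = F.array(*[F.lit(sn) for sn in all_data])

# Rows whose CONCERN does not contain 'ALL' get an empty array
df2 = df2.withColumn("EFFECTIVITY", F.when(df2.CONCERN.contains("ALL"), all_sn).otherwise(F.array()))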

Upvotes: 0
