Reputation: 8628
I combine two PySpark DataFrames with union and then aggregate the result as follows:
exprs = [max(x) for x in ["col1","col2"]]
df = df1.union(df2).groupBy(['campk', 'ppk']).agg(*exprs)
But I get this error:
AssertionError: all exprs should be Column
What is wrong?
Upvotes: 9
Views: 29607
Reputation: 151
Try the code below:
from pyspark.sql import functions as F
exprs = [F.max(x) for x in ["col1","col2"]]
print(*exprs)
Upvotes: 1
Reputation: 10082
exprs = [max(x) for x in ["col1","col2"]]
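calls Python's built-in max, which returns the character with the highest code point in each string, i.e. ['o', 'o']. These are plain strings rather than Column objects, which is why agg raises the AssertionError.
Referring to the correct max, the one in pyspark.sql.functions, would work: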
>>> from pyspark.sql import functions as F
>>> exprs = [F.max(x) for x in ["col1","col2"]]
>>> print(exprs)
[Column<max(col1)>, Column<max(col2)>]
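To see why the original list comprehension fails, here is a minimal sketch that needs no Spark session. It only shows what the built-in max does with the column names from the question; the variable name `names` is introduced here purely for illustration.

```python
import builtins

# Column names taken from the question.
names = ["col1", "col2"]

# The built-in max treats each string as a sequence of characters
# and returns the character with the highest code point.
exprs = [builtins.max(name) for name in names]

print(exprs)                                  # ['o', 'o']
print(all(isinstance(e, str) for e in exprs)) # True: plain strings, not Column objects
```

Because agg receives the strings 'o' and 'o' instead of Column expressions, it raises the assertion from the question. Importing the functions module under an alias such as F keeps the Spark max from being confused with the built-in one.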
Upvotes: 17