I have a large DF (pyspark.sql.dataframe.DataFrame) that is the result of multiple joins, plus new columns created from a combination of inputs from different DFs, including DF2.
I want to drop all DF2 columns from DF after I'm done with the join/creating new columns based on DF2 input.
drop() doesn't accept a list, only a string or a Column.
I know that df.drop("col1", "col2", "coln") will work, but I'd prefer not to crowd the code (if I can) by listing those 20 columns.
Is there a better way of doing this with a PySpark DataFrame specifically?
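For illustration, one possible approach is sketched here. Since drop() takes a variable number of column-name strings, a Python list can be unpacked into it with *, and DF2.columns already is such a list. The helper name drop_columns_of and its keep parameter are hypothetical, not part of PySpark; the sketch also assumes the DF2 columns kept their original names in DF, and that any shared join key you still need is passed in keep.

```python
def drop_columns_of(df, other, keep=()):
    """Return df without the columns whose names appear in other.columns.

    Hypothetical helper (not a PySpark API). It relies only on `other.columns`
    being a list of names and on `df.drop(*names)` accepting several
    column-name strings, which is how DataFrame.drop is called in PySpark.
    """
    keep = set(keep)  # names to preserve, e.g. a join key shared by both DFs
    to_drop = [name for name in other.columns if name not in keep]
    # Unpack the list so each name is passed as a separate string argument.
    return df.drop(*to_drop)


# Assumed usage with the DataFrames from the question ("id" is a made-up key):
# DF = drop_columns_of(DF, DF2, keep=["id"])
#
# Without a helper, the same idea is a one-liner:
# DF = DF.drop(*DF2.columns)
```

Dropping by name is a no-op for names that are not in the schema, so this will not fail if some DF2 columns were already removed, but it also will not warn you about columns that were renamed along the way.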
Upvotes: 1
Views: 443