meerkatopera

Reputation: 25

drop all df2.columns from another df (pyspark.sql.dataframe.DataFrame specific)

I have a large DF (pyspark.sql.dataframe.DataFrame) that is the result of multiple joins, plus new columns created from a combination of inputs from different DFs, including DF2.

I want to drop all DF2 columns from DF after I'm done with the join and with creating the new columns based on DF2's input. drop() doesn't accept a list, only a string or a Column.

I know that df.drop("col1", "col2", "coln") will work, but I'd prefer not to crowd the code by listing all 20 columns.

Is there a better way of doing this in pyspark dataframe specifically?

Upvotes: 1

Views: 443

Answers (1)

过过招

Reputation: 4224

# Unpack df2's column names so drop() receives them as individual arguments
drop_cols = df2.columns
df = df.drop(*drop_cols)
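
For context, a minimal self-contained sketch of the same pattern; the DataFrames, column names, and join key below are hypothetical stand-ins, not the asker's actual data:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("drop-df2-columns").getOrCreate()

# Hypothetical stand-ins for DF and DF2
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])
df2 = spark.createDataFrame([(1, 10), (2, 20)], ["id", "score"])

# Join and derive a new column that uses DF2's input
joined = (
    df.join(df2, on="id", how="left")
      .withColumn("val_score", F.col("score") * 2)
)

# Drop every DF2 column in one call by unpacking the list.
# Note: this also drops the join key ("id" here) because it appears
# in df2.columns; filter it out of the list first if you need to keep it.
result = joined.drop(*df2.columns)
result.show()

Because drop() takes variadic string (or Column) arguments, unpacking the column-name list with * keeps the call compact no matter how many columns DF2 has.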

Upvotes: 4
