TS1000

Reputation: 31

PySpark: Converting a column to binary using a 'greater than' boolean expression

Is there a way in PySpark to create a new binary column from a comparison such as 'greater than or equal to 1'? I have a column of retweet counts and need a new column that is 0 for zero retweets and 1 for one or more retweets.

Upvotes: 0

Views: 956

Answers (1)

mck

Reputation: 42382

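You can compare the column against 1 and cast the resulting boolean to an integer. Here F is pyspark.sql.functions, and withColumn returns a new DataFrame rather than modifying df in place:

from pyspark.sql import functions as F
df.withColumn('greater_than_1', (F.col('retweets').cast('int') >= 1).cast('int'))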

Upvotes: 1
