Reputation: 31
Is there a way in PySpark to create a new binary column from an existing one? I have a column of retweet counts and need a new column that is 0 when a tweet has zero retweets and 1 when it has one or more.
Upvotes: 0
Views: 956
Reputation: 42382
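You can compare the column with 1 and cast the resulting boolean to an integer:
from pyspark.sql import functions as F
df = df.withColumn('greater_than_1', (F.col('retweets').cast('int') >= 1).cast('int'))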
Upvotes: 1