Reputation: 37
I want to extract the part of the value that comes before (Impressions).
For example, if I have the value YouTube TrueView for Reach (Impressions), I need YouTube TrueView for Reach.
Another example: YouTube Bumper (Impressions) --> YouTube Bumper
I am currently using:
validated_df=validated_df.withColumn("MediaNm", when(col("MediaNm").like("%Impressions%"),F.regexp_extract(F.col("MediaNm"), r".*?\(", 0)).otherwise(validated_df.MediaNm))
This gives me a blank value as the result.
Upvotes: 1
Views: 956
Reputation: 9277
If I understood correctly, you just want to remove the string ' (Impressions)'. For this, regexp_replace is all you need:
validated_df.withColumn('MediaNm', F.regexp_replace('MediaNm', r' \(Impressions\)', ''))
+--------------------------+
|MediaNm                   |
+--------------------------+
|YouTube TrueView for Reach|
|YouTube Bumper            |
+--------------------------+
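To check the pattern without a Spark session, here is a minimal sketch in plain Python. It assumes that, for a pattern this simple, Python's re module behaves the same as the Java regex engine Spark uses. The helper name strip_impressions and the "Display Banner" sample are made up for illustration and are not part of the original data.

```python
import re

# Same pattern as the one passed to F.regexp_replace above: a literal space
# followed by the literal text "(Impressions)". The parentheses are escaped
# so they are matched literally instead of opening a capture group.
SUFFIX_PATTERN = re.compile(r" \(Impressions\)")


def strip_impressions(media_name: str) -> str:
    """Hypothetical helper: remove ' (Impressions)' from a single value."""
    return SUFFIX_PATTERN.sub("", media_name)


samples = [
    "YouTube TrueView for Reach (Impressions)",
    "YouTube Bumper (Impressions)",
    "Display Banner",  # made-up value without the suffix; stays unchanged
]
cleaned = [strip_impressions(s) for s in samples]
print(cleaned)
```

Values that do not contain the suffix pass through untouched, which is why no when/otherwise guard is needed around the replacement.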
Upvotes: 1