abhinav singh

Reputation: 1104

Matching string in column value returns none value

I am using PySpark and have a dataset with a column named "Company". I am trying to keep only the rows where Company matches "Microsoft".

Here is what I wrote:

new_df = file_df.filter(file_df.Company.str.contains('Microsoft', case=False, regex=True))
display(new_df)

This returns no results. I am not sure what is missing in my code. Can someone point me in the right direction?

Upvotes: 0

Views: 107

Answers (2)

Vaebhav

Reputation: 5062

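The PySpark Column.contains API does not accept case or regex arguments; it takes only the value to search for. The .str.contains(..., case=..., regex=...) form is the pandas API, not the Spark one.

If you need regex matching, look into rlike.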

Contains

from pyspark.sql import functions as F
file_df.filter(F.col('Company').contains('Microsoft'))

RLike

from pyspark.sql import functions as F
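# rlike takes a regular expression, so % (a SQL LIKE wildcard) would be matched literally.
# The inline flag (?i) makes the match case-insensitive.
file_df.filter(F.col('Company').rlike('(?i)Microsoft'))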
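
Note that rlike interprets its argument as a regular expression and matches it anywhere in the value, so SQL LIKE wildcards such as % are treated as literal characters. The sketch below illustrates the difference using Python's standard re module as a stand-in for the Java regex engine that Spark uses. That substitution is an assumption that holds for simple patterns like these, and the sample company names are made up.

```python
import re

# Hypothetical sample values standing in for the Company column.
companies = ["Microsoft", "microsoft corp", "Apple", "100%Microsoft%"]


def matching(pattern, values):
    """Return the values in which the regex is found anywhere (as rlike does)."""
    return [v for v in values if re.search(pattern, v)]


# A LIKE-style pattern used as a regex: the % signs must appear literally.
like_style = matching(r"%Microsoft%", companies)

# A plain regex with the inline case-insensitive flag.
case_insensitive = matching(r"(?i)microsoft", companies)

print(like_style)
print(case_insensitive)
```

With this sample data, the LIKE-style pattern only matches the value that actually contains percent signs, while the (?i) pattern matches every spelling of Microsoft.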

Upvotes: 1

Ghouse thanedar

Reputation: 56

display(file_df.filter("Company=='Microsoft'"))

Upvotes: 0
