Nuhaa All Bakry

Reputation: 21

using wildcard in Spark SQL

I have a dataframe like this:

+-------------------------------------------+
|url                                        |
+-------------------------------------------+
|/v3/references/genders                     |
|/en/job/restaurant-manager-6619735/panels  |
|/en/job-search/dealer-coordinator-jobs/    |
|/en/job/engineer-3034030/panels            |
|/en/job/business-analyst-5385899           |
+-------------------------------------------+

I'm trying to get the count for each url that contains 'job'. I tried this, but I got an empty result.

df.createOrReplaceTempView("table")
spark.sql("select url, count(url) from table where url like 'job'").show()

What is wrong with that SQL? Thanks!

Upvotes: 1

Views: 8791

Answers (1)

abaghel

Reputation: 15327

Try this. `LIKE 'job'` matches only rows whose url is exactly the string 'job', so nothing in your data qualifies; wrapping the pattern in `%` wildcards matches any url that *contains* 'job'. Also, selecting the non-aggregated `url` column alongside `count(url)` requires a `GROUP BY url`.

spark.sql("select url, count(url) from table where url like '%job%' GROUP BY url").show()
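To see why the original query came back empty without spinning up Spark, here is a plain-Python sketch that emulates SQL `LIKE` semantics for the `%` wildcard (the helper `sql_like` is an illustration written for this answer, not Spark's actual implementation, and it ignores the `_` wildcard):

```python
import re

# The urls from the question's dataframe.
urls = [
    "/v3/references/genders",
    "/en/job/restaurant-manager-6619735/panels",
    "/en/job-search/dealer-coordinator-jobs/",
    "/en/job/engineer-3034030/panels",
    "/en/job/business-analyst-5385899",
]

def sql_like(value: str, pattern: str) -> bool:
    """Emulate SQL LIKE: '%' matches any run of characters;
    everything else is matched literally against the whole value."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value) is not None

# LIKE 'job' is an exact match against the full url, so nothing matches:
print(sum(sql_like(u, "job") for u in urls))    # 0
# LIKE '%job%' matches any url containing 'job':
print(sum(sql_like(u, "%job%") for u in urls))  # 4
```

The same substring test in Spark can also be written with `WHERE url LIKE '%job%'` in SQL, or `df.filter(df.url.contains("job"))` on the DataFrame API.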

Upvotes: 2
