Reputation: 21
I have a dataframe like this:
+-------------------------------------------+
|url                                        |
+-------------------------------------------+
|/v3/references/genders                     |
|/en/job/restaurant-manager-6619735/panels  |
|/en/job-search/dealer-coordinator-jobs/    |
|/en/job/engineer-3034030/panels            |
|/en/job/business-analyst-5385899           |
+-------------------------------------------+
I'm trying to get the count for each url that contains 'job'. I tried the following, but I got an empty result.
df.createOrReplaceTempView("table")
spark.sql("select url, count(url) from table where url like 'job'").show()
What is wrong with that SQL? Thanks!
Upvotes: 1
Views: 8791
Reputation: 15327
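LIKE without wildcards only matches when the whole string equals the pattern, so url like 'job' matches none of your rows. Wrap the pattern in % wildcards to match 'job' anywhere in the url, and add GROUP BY url, since url is selected alongside an aggregate. Try this.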
spark.sql("select url, count(url) from table where url like '%job%' GROUP BY url").show()
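If you want to check the pattern behaviour without a Spark session, here is a rough sketch that runs the same two LIKE patterns against SQLite from Python's standard library. This is a stand-in, not Spark: the table name urls_table is made up for the example, and SQLite's LIKE is case-insensitive for ASCII while Spark's is case-sensitive, which makes no difference for this all-lowercase data.

```python
import sqlite3

# Sample rows copied from the question.
urls = [
    "/v3/references/genders",
    "/en/job/restaurant-manager-6619735/panels",
    "/en/job-search/dealer-coordinator-jobs/",
    "/en/job/engineer-3034030/panels",
    "/en/job/business-analyst-5385899",
]

# In-memory SQLite database standing in for the Spark temp view (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls_table (url TEXT)")
conn.executemany("INSERT INTO urls_table VALUES (?)", [(u,) for u in urls])

# No wildcards: the whole url would have to equal 'job'.
exact = conn.execute(
    "SELECT url, count(url) FROM urls_table WHERE url LIKE 'job' GROUP BY url"
).fetchall()

# % on both sides: 'job' may appear anywhere in the url.
per_url = conn.execute(
    "SELECT url, count(url) FROM urls_table "
    "WHERE url LIKE '%job%' GROUP BY url ORDER BY url"
).fetchall()

# Single total instead of one row per url.
total = conn.execute(
    "SELECT count(*) FROM urls_table WHERE url LIKE '%job%'"
).fetchone()[0]

print(exact)   # []
for row in per_url:
    print(row)
print(total)   # 4
```

Note that with GROUP BY url every distinct url gets its own row, so each count here is 1. If what you actually want is one number for all urls containing 'job', drop url from the select list and the GROUP BY, as in the last query.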
Upvotes: 2