Volatil3

Reputation: 14988

PySpark: How do I count the number of spaces in a string?

I know it is doable in plain Python, but is there a built-in PySpark function for this, or something along the lines of LIKE or IN? For instance, if the name column contains "John    Doe" (four spaces between the two names), it should return 4 as the space count.

Or should I create a UDF?

Upvotes: 2

Views: 710

Answers (1)

ZygD

Reputation: 24396

A couple of options:

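from pyspark.sql import functions as F

F.size(F.split('col_name', ' ')) - 1  # n spaces split the string into n + 1 parts
F.length(F.regexp_replace('col_name', '[^ ]+', ''))  # delete every non-space run, measure what is left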

On Spark 3.4+, there is also the built-in SQL function regexp_count:

F.expr("regexp_count(col_name, ' ')")
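
To see why these expressions all give the same number, here is a rough plain-Python sketch of the logic behind each one. It does not use Spark at all; the helper names (spaces_via_split and so on) and the sample value are made up for illustration, and Python's str.split and re module are only stand-ins for the Spark functions.

```python
import re


def spaces_via_split(s: str) -> int:
    # Same idea as F.size(F.split(col, ' ')) - 1:
    # n spaces cut the string into n + 1 pieces (empty pieces included).
    return len(s.split(' ')) - 1


def spaces_via_replace(s: str) -> int:
    # Same idea as F.length(F.regexp_replace(col, '[^ ]+', '')):
    # remove every run of non-space characters, then measure what remains.
    return len(re.sub(r'[^ ]+', '', s))


def spaces_via_count(s: str) -> int:
    # Same idea as regexp_count(col, ' '): count matches of a single space.
    return len(re.findall(' ', s))


# Hypothetical sample value: "John", four spaces, "Doe".
sample = 'John' + ' ' * 4 + 'Doe'

print(spaces_via_split(sample), spaces_via_replace(sample), spaces_via_count(sample))
```

All three helpers should agree on any non-null string, including ones with leading or trailing spaces. Null handling is a separate matter in Spark and is not covered by this sketch.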

Upvotes: 5
