Sologoub

Reputation: 5352

Spark alternative for Redshift REGEXP_SUBSTR function

I'm trying to convert part of a Redshift query into Spark SQL, or some combination of SQL and a UDF:

REGEXP_SUBSTR(referrer, '[^/]+\\.[^/:]+') as referrer_domain,

I tried regexp_extract(referrer, '[^/]+\\.[^/:]+', 1), but it doesn't seem to work the same way and returns inconsistent results.

Any pointers appreciated!

Upvotes: 1

Views: 4192

Answers (1)

Kumar Vaibhav

Reputation: 2642

You should be able to use regexp_extract in Spark SQL, something like this:

regexp_extract(columnName, '(YourRegex)', 1) as aliasName

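Note the () around the regex: the third argument (1) selects the first capture group, so the pattern has to define one. Alternatively, index 0 returns the whole match without any parentheses. Hope it helps!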
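To see what the group index does, here is a rough sketch in plain Python rather than Spark. It uses the standard `re` module to mimic the group semantics, on the assumption that this pattern behaves the same way under Java regex (which Spark uses), since it relies only on basic character classes. The `extract_domain` helper and the sample URL are made up for illustration.

```python
import re

# Same pattern as in the question; a raw string keeps the backslash literal.
PATTERN = r"[^/]+\.[^/:]+"


def extract_domain(referrer, pattern=PATTERN, group=0):
    """Return the requested group of the first match, or '' if nothing matches.

    Returning '' on no match mirrors how regexp_extract reports a miss.
    """
    m = re.search(pattern, referrer)
    return m.group(group) if m else ""


url = "http://www.example.com:8080/path/page.html"  # hypothetical sample value

# Group 0 is the whole match, so no parentheses are needed.
whole = extract_domain(url)

# Group 1 only exists once the pattern is wrapped in parentheses.
wrapped = extract_domain(url, "(" + PATTERN + ")", group=1)

print(whole)
print(wrapped)
```

Both calls pick out the same substring here. Asking for group 1 on the unwrapped pattern fails in Python, because the pattern defines no such group, which is likely related to the trouble with the original `regexp_extract(..., 1)` call.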

Upvotes: 2
