sectechguy
sectechguy

Reputation: 2117

Selecting string after the last \\ using regex with Impala SQL

I have a dataset with a column with processes and the path. I am trying to use regex with Impala to strip off the executable. The dataset looks like this:

C:\\Windows\\System32\\svchost.exe
C:\\Windows\\System32\\conhost.exe
C:\\Windows\\System32\\net1.exe
C:\\Windows\\System32\\schtasks.exe
C:\\Program Files (x86)\\Citrix\\ICA Client\\SelfServicePlugin\\SelfService.exe
C:\\Windows\\System32\\backgroundTaskHost.exe
C:\\Windows\\System32\\net.exe
C:\\Windows\\System32\\conhost.exe
C:\\Program Files (x86)\\Wireless AutoSwitch\\wrlssw.exe

Desired output:

svchost.exe
conhost.exe
net1.exe
schtasks.exe
SelfService.exe
backgroundTaskHost.exe
net.exe
conhost.exe
wrlssw.exe

I have tried many queries like the two below but keep running into errors

select regexp_extract(w.destinationprocessname, '([^\\]+)$')
from winworkstations_realtime w
where w.externalid = '4688'
limit 10

Error:

AnalysisException: No matching function with signature: regexp_replace(STRING, STRING).

select regexp_extract(w.destinationprocessname, '\\(?:.(?!\\))+$',0)
from winworkstations_realtime w
where w.externalid = '4688'
limit 10

Error:

Could not compile regexp pattern: \(?:.(?!\))+$ Error: invalid perl operator: (?!

Looking for some guidance from anyone that is good with impala or regex.

Upvotes: 0

Views: 556

Answers (1)

hlagos
hlagos

Reputation: 7947

Not being a regex expert, I am sure there is a better way but this does the work

select regexp_replace(regexp_extract("C:\\\Windows\\\\System32\\\\svchost.exe", ".+(\\\\.+)$", 1), "\\\\", "");

Upvotes: 1

Related Questions