Ian

Reputation: 5725

REGEX: Extract paths from string

I would like to extract paths in the form of:

$/Server/First Level Folder/Second_Level_Folder/My File.extension

The challenge here is that the paths are embedded in a "free form" email like so:

Hello,

 You can download the file here:
  • $/Server/First Level Folder/Second_Level_Folder/My File.extension <- Click me!

Given a string, I would like to extract all paths from it using RegEx. Is this even possible?

Thanks!

Upvotes: 4

Views: 28034

Answers (2)

Santrix

Reputation: 935

If the filename contains escaped forward slashes (\/) or has no period, and the spaces in the file path are escaped with a backslash ('\ '), you can still do it with the following regex (I've escaped the forward slashes and backslashes):

(\/.*?\/)((?:[^\/]|\\\/)+?)(?:(?<!\\)\s|$)

This creates two capture groups: one for the path and one for the file basename. If your test strings contain filenames with unescaped spaces (as shown in the question), you would have to use the period in the filename as an anchor, as per B8vrede's answer.
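As a rough illustration, here is how that regex could be tried out in Python. The sample line is an assumption: it is the path from the question rewritten with backslash-escaped spaces, which is the input this pattern expects.

```python
import re

# Santrix's pattern, unchanged. Group 1 is the directory part, group 2 the basename.
ESCAPED_PATH_RE = re.compile(r"(\/.*?\/)((?:[^\/]|\\\/)+?)(?:(?<!\\)\s|$)")

# Hypothetical input: the question's path with its spaces escaped as "\ ".
escaped_line = r"$/Server/First\ Level\ Folder/Second_Level_Folder/My\ File.extension <- Click me!"

escaped_match = ESCAPED_PATH_RE.search(escaped_line)
escaped_dir = escaped_match.group(1)       # everything up to and including the last slash
escaped_basename = escaped_match.group(2)  # ends at the first unescaped whitespace

print(escaped_dir)
print(escaped_basename)
```

Note that the match starts at the first forward slash, so the leading $ is not part of group 1.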

Upvotes: 4

B8vrede

Reputation: 4532

Yes, this is possible. (\$/.*?\.\S*) should do the job just fine.

\$/ matches the start of the path

.*? lazily matches as few characters as possible until the next part of the regex can match

\.\S* matches the dot and everything after it up to the next whitespace character (space, tab, newline)

And the ( ) around it capture everything that is matched.
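A minimal sketch of the regex above applied to the email from the question, assuming Python's re module (the question does not name a language, so that choice is an assumption):

```python
import re

# B8vrede's pattern, unchanged: "$/" up to the first dot, then the run of
# non-whitespace characters that follows (the extension).
PATH_RE = re.compile(r"(\$/.*?\.\S*)")

email = """Hello,

 You can download the file here:
  $/Server/First Level Folder/Second_Level_Folder/My File.extension <- Click me!

Thanks!
"""

# With a single capture group, findall returns the captured text of every match.
paths = PATH_RE.findall(email)
print(paths)
```

Because the pattern anchors on the first dot after $/, a folder name that itself contains a dot (for example "v1.2 files") would cut the match short, so this is best suited to paths where only the filename has a dot.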

EDIT:

For further use

Just the path

(\$/.*?/)[^/]*?\.\S*

Just the filename

\$/.*?/([^/]*?\.\S*)
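The two variants can be checked the same way. This is only a sketch in Python (an assumed language), using the line from the question as sample input:

```python
import re

# Both patterns are taken unchanged from the answer; only the capture group moves.
PATH_ONLY_RE = re.compile(r"(\$/.*?/)[^/]*?\.\S*")
FILE_ONLY_RE = re.compile(r"\$/.*?/([^/]*?\.\S*)")

line = "$/Server/First Level Folder/Second_Level_Folder/My File.extension <- Click me!"

# [^/]*? cannot cross a slash, so the lazy .*? is forced to extend up to the last folder.
folder = PATH_ONLY_RE.search(line).group(1)
filename = FILE_ONLY_RE.search(line).group(1)

print(folder)
print(filename)
```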

Upvotes: 12
