Reputation: 97
I used the regex (\/.*\.[\w:]+) to find all file paths and directories. But for a line like "file path /log/file.txt some lines /log/var/file2.txt", which contains two paths, it does not select the paths individually; instead, it matches everything from the first path to the end of the line as a single match. How can I solve this?
Upvotes: 6
Views: 33424
Reputation: 31
You can use Python's re module, something like this:
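import re
msg = "file path /log/file.txt some lines /log/var/file2.txt"
# A slash followed by word characters, dots, or further slashes.
# The raw string keeps \w from being treated as a string escape.
matches = re.findall(r"/[\w./]+", msg)
print(matches)  # ['/log/file.txt', '/log/var/file2.txt']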
Ref: https://docs.python.org/2/library/re.html#finding-all-adverbs
Upvotes: 3
Reputation: 163342
Your regex (\/.*\.[\w:]+) uses .*, which is greedy: it consumes as much of the line as it can, so [\w:]+ only matches after the last dot, the one in file2.txt. You could use the non-greedy .*? instead.
But that would also match /log////var////.txt
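To see the difference on the line from the question, here is a small sketch comparing the greedy and non-greedy versions. The variable names are only for illustration.

```python
import re

line = "file path /log/file.txt some lines /log/var/file2.txt"

# Greedy: .* runs to the end of the line and backtracks only to the last dot
greedy = re.findall(r'/.*\.[\w:]+', line)

# Non-greedy: .*? stops at the first dot that lets the rest of the pattern match
lazy = re.findall(r'/.*?\.[\w:]+', line)

# The non-greedy version still accepts repeated slashes
odd = re.findall(r'/.*?\.[\w:]+', "or /log////var////.txt")

print(greedy)  # ['/log/file.txt some lines /log/var/file2.txt']
print(lazy)    # ['/log/file.txt', '/log/var/file2.txt']
print(odd)     # ['/log////var////.txt']
```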
As an alternative, you might use a repeating non-greedy group that matches the directory structure, (?:/[^/]+)+?, followed by a part that matches the filename, /\w+\.\w+:
import re
s = "file path /log/file.txt some lines /log/var/file2.txt or /log////var////.txt"
print(re.findall(r'(?:/[^/]+)+?/\w+\.\w+', s))
That would result in:
['/log/file.txt', '/log/var/file2.txt']
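One thing to watch with [^/]+ is that it also matches spaces, so a directory part can run across ordinary words until the next slash. A possible tightening, offered as a sketch rather than a definitive fix, is to exclude whitespace as well with [^/\s]+. The names loose, strict, and the prose string below are made up for the demonstration.

```python
import re

loose = r'(?:/[^/]+)+?/\w+\.\w+'      # pattern from the answer above
strict = r'(?:/[^/\s]+)+?/\w+\.\w+'   # assumption: path components contain no whitespace

prose = "open /tmp and then read data/report.csv"
sample = "file path /log/file.txt some lines /log/var/file2.txt or /log////var////.txt"

loose_prose = re.findall(loose, prose)
strict_prose = re.findall(strict, prose)
strict_sample = re.findall(strict, sample)

print(loose_prose)    # ['/tmp and then read data/report.csv']
print(strict_prose)   # []
print(strict_sample)  # ['/log/file.txt', '/log/var/file2.txt']
```

Both versions still require at least one directory before the filename, so a bare /report.csv is not matched by either.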
Upvotes: 5
Reputation: 1573
Use the regex (\/.*?\.[\w:]+) to make the match non-greedy. To find multiple matches in the same line, you can use re.findall().
Update: Using this code and the example provided, I get:
import re
re.findall(r'(\/.*?\.[\w:]+)', "file path /log/file.txt some lines /log/var/file2.txt")
['/log/file.txt', '/log/var/file2.txt']
Upvotes: 8