Reputation: 110422
I am trying to extract the filename from the following string:
s = '[download] /tmp/743979_file.mp4 has already been downloaded'
Here is what I have so far:
>>> re.search(r'(\s).+_file[^\s]+', s).group()
' /tmp/743979_file.mp4'
How would I get everything after the first space and before the second space, provided that it includes the string _file?
Upvotes: 0
Views: 3650
Reputation: 174816
Use \S to match any non-space character, so \S* matches zero or more non-space characters. \s is the opposite of \S: \s matches any whitespace character and \S matches any non-whitespace character.
>>> s = '[download] /tmp/743979_file.mp4 has already been downloaded'
>>> re.search(r'(?<=\s)\S*_file\S*', s).group()
'/tmp/743979_file.mp4'
Or simply:
>>> re.search(r'\S*_file\S*', s).group()
'/tmp/743979_file.mp4'
Or, without a regex:
>>> s = '[download] /tmp/743979_file.mp4 has already been downloaded'
>>> m = s.split()[1]
>>> if '_file' in m:
...     print(m)
...
/tmp/743979_file.mp4
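One caveat with the regex versions: re.search returns None when nothing matches, so calling .group() directly raises AttributeError on a line that has no _file token. A minimal sketch that guards against this, assuming a helper name (find_file_token) that is purely illustrative:

```python
import re

def find_file_token(line):
    """Return the first run of non-space characters containing '_file', or None."""
    # \S* on both sides grows the match out to the surrounding whitespace.
    m = re.search(r'\S*_file\S*', line)
    return m.group() if m else None

s = '[download] /tmp/743979_file.mp4 has already been downloaded'
print(find_file_token(s))                        # /tmp/743979_file.mp4
print(find_file_token('[download] no match'))    # None
```

Returning None rather than raising is just one possible choice; raising a ValueError may suit callers that treat a missing filename as an error.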
Upvotes: 4
Reputation: 2140
Another simple solution could be using split:
print('[download] /tmp/743979_file.mp4 has already been downloaded'.split()[1])
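Note that indexing with [1] raises IndexError if the line has fewer than two whitespace-separated tokens, and it does not check for _file at all. A hedged sketch that adds both checks, where the function name second_token and the marker parameter are hypothetical and not part of the original answer:

```python
def second_token(line, marker='_file'):
    """Return the second whitespace-separated token if it contains marker, else None."""
    # maxsplit=2 stops splitting once the first two tokens are isolated.
    parts = line.split(None, 2)
    if len(parts) > 1 and marker in parts[1]:
        return parts[1]
    return None

s = '[download] /tmp/743979_file.mp4 has already been downloaded'
print(second_token(s))             # /tmp/743979_file.mp4
print(second_token('[download]'))  # None
```

Like the one-liner, this assumes the path itself contains no spaces.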
Upvotes: 2