Reputation: 4208
I have a string in the form of:
integer, integer, a comma-separated list of strings, integer
For example:
"0, 0, ['REFERENCED', 'UPTODATE', 'LRU'], 1"
I want to return this substring: ['REFERENCED', 'UPTODATE', 'LRU']
I thought of using split(", ") and then joining the pieces back together, but that seems overly complicated. How can I do this with a regex?
Upvotes: 0
Views: 319
Reputation: 12239
There is no need for a regex. Wrap your string in brackets to make a string representation of a list, then use ast.literal_eval to turn it into an actual list.
import ast
s = "0, 0, ['REFERENCED', 'UPTODATE', 'LRU'], 1"
outer_list = ast.literal_eval('[' + s + ']')
inner_list = outer_list[2]
print(inner_list)
You may be tempted to use eval instead of ast.literal_eval. Resist the temptation. Using eval is unsafe because it will evaluate any Python expression, even one that contains nasty stuff such as instructions to delete files from your hard drive. ast.literal_eval is far safer because it only accepts literals: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.
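As a small sketch of that difference: ast.literal_eval accepts a plain literal but raises ValueError when the string contains anything else, such as a function call. The sample strings and the name rejected are only illustrative.

```python
import ast

# A plain literal parses into the corresponding Python object.
parsed = ast.literal_eval("[1, 'a', None]")
print(parsed)

# A function call is not a literal, so it is refused rather than executed.
rejected = False
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError as exc:
    rejected = True
    print("rejected:", exc)
```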
Upvotes: 1
Reputation: 310287
Just write a regular expression that captures a group consisting of a [, any characters, and then a ].
>>> import re
>>> s = "0, 0, ['REFERENCED', 'UPTODATE', 'LRU'], 1"
>>> re.search(r'(\[.*\])', s).group(1)
"['REFERENCED', 'UPTODATE', 'LRU']"
If the input really is this well structured, you could use ast.literal_eval:
>>> import ast
>>> ast.literal_eval(s)[2]
['REFERENCED', 'UPTODATE', 'LRU']
This safely evaluates a string containing Python literals (here it parses as a tuple) and pulls the third element out of the tuple.
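The two ideas can also be combined when the surrounding text is not guaranteed to be valid Python. A minimal sketch, assuming the bracketed part itself is a valid list literal; extract_list is a hypothetical helper name, not something from the standard library.

```python
import ast
import re

def extract_list(s):
    """Return the bracketed part of s as a list, or None if there is none."""
    # Greedy match: from the first '[' through the last ']'.
    m = re.search(r'\[.*\]', s)
    if m is None:
        return None
    # Parse only the matched text, so the rest of s need not be valid Python.
    return ast.literal_eval(m.group(0))

print(extract_list("0, 0, ['REFERENCED', 'UPTODATE', 'LRU'], 1"))
print(extract_list("0, 0, 1"))
```

Checking for None before calling group avoids an AttributeError when the string has no brackets at all.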
Upvotes: 2
Reputation: 180540
s = "0, 0, ['REFERENCED', 'UPTODATE', 'LRU'], 1"
start = s.find("[")
end = s.rfind("]")
print(s[start:end+1])
['REFERENCED', 'UPTODATE', 'LRU']
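Note that str.find and str.rfind return -1 when the character is missing, in which case the slice above yields an empty or misleading string instead of failing. A sketch that guards against that; bracketed is a hypothetical helper name.

```python
def bracketed(s):
    """Return the text from the first '[' to the last ']', or None."""
    start = s.find("[")
    end = s.rfind("]")
    # find/rfind return -1 when nothing is found; also reject ']' before '['.
    if start == -1 or end < start:
        return None
    return s[start:end + 1]

print(bracketed("0, 0, ['REFERENCED', 'UPTODATE', 'LRU'], 1"))
print(bracketed("0, 0, 1"))
```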
Upvotes: 1
Reputation: 17647
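If you're just looking for an expression, try something like:
r"\[([\w,' ]+)\]"
The character class includes a space because the sample string has one after each comma; without it the pattern does not match the example. \d is dropped because \w already covers digits.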
Upvotes: 0