Reputation: 571
In the following text, I want to extract the keys with their values.
I've written the following regex, but it does not match values that span multiple lines. Regex: --(.*)=.*(?=(.|--|\n|\Z)*)
--some text here not to be matched
--key1=this is a
multiline statement
statement
--random text not to be matched
--key2=val2
--key3=val3
--random text here not to be matched
So, after matching, the output should be:
--key1=this is a
multiline statement
statement
--key2=val2
--key3=val3
Upvotes: 1
Views: 101
Reputation: 28303
Perhaps the OP provided a simplified example and the actual code does require a regex, but the example above can be filtered without one.
The central insight in this method of filtering out the junk lines is to remove all lines that start with -- but don't contain =.
text = """--some text here not to be matched
--key1=this is a
multiline statement
statement
--random text not to be matched
--key2=val2
--key3=val3
--random text here not to be matched"""
valid_lines = [l for l in text.split('\n') if not (l.startswith('--') and '=' not in l)]
result = '\n'.join(valid_lines)
print(result)
# output
--key1=this is a
multiline statement
statement
--key2=val2
--key3=val3
To construct a dictionary out of the result text:
mydata = {data.split('=')[0]:data.split('=')[1].strip('\n') for data in result.strip('-').split('--')}
print(mydata)
#outputs:
{'key1': 'this is a\nmultiline statement\nstatement', 'key2': 'val2', 'key3': 'val3'}
Upvotes: 0
Reputation: 2497
Ajax's answer will fail if any of the values contain -. Instead, use a negative lookahead to ensure that the values do not contain --.
This regex will do that: --.+=((?!--)[\S\s])+
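For illustration, here is a minimal sketch of how that pattern might be used from Python. The capture groups are my own assumption and not part of the regex above: the key is captured lazily with (.+?), and the repeated lookahead group is made non-capturing and wrapped in a single group so the whole value is captured. The names pattern and pairs are hypothetical.

```python
import re

text = """--some text here not to be matched
--key1=this is a
multiline statement
statement
--random text not to be matched
--key2=val2
--key3=val3
--random text here not to be matched"""

# "--", then the key (lazy; "." never crosses a newline), then "=",
# then any characters, including newlines, up to the next "--".
pattern = re.compile(r'--(.+?)=((?:(?!--)[\S\s])+)')

# Strip the trailing newline that precedes the next "--" line.
pairs = {m.group(1): m.group(2).strip() for m in pattern.finditer(text)}
print(pairs)
```

A single - inside a value is fine with this approach, but a value containing -- would still be cut short there, and a junk line that happens to contain = would be picked up as a key.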
Upvotes: 1
Reputation: 71461
You can try this:
import re
s = """
--some text here not to be matched
--key1=this is a
multiline statement
statement
--random text not to be matched
--key2=val2
--key3=val3
--random text here not to be matched
"""
# \w also matches digits, so values such as val2 are captured in full
new_data = re.findall(r'--\w+=[\w\s]+', s)
for i in new_data:
    print(i.strip())
Output:
--key1=this is a
multiline statement
statement
--key2=val2
--key3=val3
Upvotes: 2