Reputation: 771
I have the following string:
'FIELDS--> FIELD1: Random Sentence \r\n FIELD2: \r\nSOURCEHINT--> FIELD3:
value.nested.value, FIELD4: 5.5.5.5, FIELD5: Longer Sentence, with more words-and punctation\r\n'
I want the following from the string above:
[FIELD1, Random Sentence]
[FIELD2, ]
[FIELD3, value.nested.value]
[FIELD4, 5.5.5.5]
[FIELD5, Longer Sentence, with more words-and punctation]
I still want the value if it is empty, and I want the full sentences. The number of fields may vary as well. This is similar to "Match word before and after colon", but in this case I want the full sentence instead of just the word. Additionally, the field names can change, so they could be KEY3 instead of FIELD1.
I tried:
re.findall(r'(\w+) *:(?:(.*)?)', x)
It stops matching after the first match: it outputs only FIELD1 and captures everything after it as the value.
Upvotes: 1
Views: 111
Reputation: 627103
It seems you may use
r'(\w+) *: *(.*?)(?=\s*(?:\w+:|$))'
See the regex demo
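To try the pattern in Python, here is a minimal sketch. It assumes the \r\n sequences in the sample are real CR/LF characters and that the wrapped line in the question is joined with a single space. Two things in it are assumptions that go beyond the pattern itself: the re.M flag (so that $ also matches at the end of each line, which lets the empty FIELD2 value match) and stripping the trailing comma that separates one field from the next.

```python
import re

# Sample input from the question. Assumption: \r\n are real CR/LF characters
# and the wrapped line is joined with a single space.
x = ('FIELDS--> FIELD1: Random Sentence \r\n FIELD2: \r\nSOURCEHINT--> FIELD3: '
     'value.nested.value, FIELD4: 5.5.5.5, FIELD5: Longer Sentence, '
     'with more words-and punctation\r\n')

# The pattern exactly as given above.
pattern = r'(\w+) *: *(.*?)(?=\s*(?:\w+:|$))'

# Assumption: re.M makes $ match before every newline, not only at the end
# of the string, so an empty value followed by a line break is still captured.
matches = re.findall(pattern, x, flags=re.M)

# Assumption: a comma that only separates one field from the next is not
# part of the value, so it is stripped from the right-hand side.
pairs = [[key, value.rstrip(',')] for key, value in matches]

for pair in pairs:
    print(pair)
```

As far as I can tell, without re.M the FIELD2 entry is skipped on this input, because . does not cross the line break and $ then only matches at the very end of the string. The comma inside the FIELD5 sentence is kept, since only trailing commas are stripped.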
Details
(\w+)
- Group 1: one or more word chars
 *: *
- a : enclosed with zero or more spaces on each side
(.*?)
- Group 2: any chars other than line breaks, 0 or more repetitions, as few as possible, up to the first occurrence of the lookahead below
(?=\s*(?:\w+:|$))
- a positive lookahead: 0+ whitespaces followed with either 1+ word chars followed with : or an end of the string position.
Upvotes: 1