Reputation: 505
As an example of the type of content I have to parse off of a ticket:
Name:
snakeoil
Host:
foobar
{block}
email: some data here
url: http://foo
date: 01/02/16
{block}
I can identify the 'key', which is typically any word ending in a colon.
I could use a regex such as ^\w+: to extract the key, but I must handle both the case where the value is on the same line and the case where it is on the following line.
Fetching the value from the next line is the part I can't work out how to handle cleanly.
Upvotes: 0
Views: 113
Reputation: 10680
If you need email, url and date too:
>>> re.findall(r'\s*(.*?):[\n\s]?(.*)$', s, re.MULTILINE)
[('Name', 'snakeoil'), ('Host', 'foobar'), ('email', 'some data here'), ('url', 'http://foo'), ('date', '01/02/16')]
If not, @QiangJin's solution is good.
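For anyone who wants to run this outside the interpreter, here is a minimal self-contained sketch of the same pattern. The names `ticket`, `PAIR` and `parse_ticket` are made up for illustration, and it assumes each key appears only once, since the pairs are collected into a dict.

```python
import re

# Sample ticket text from the question; `ticket` is a hypothetical variable name.
ticket = """Name:
snakeoil
Host:
foobar
{block}
email: some data here
url: http://foo
date: 01/02/16
{block}
"""

# Key, colon, one optional whitespace character (a space or a newline),
# then the rest of that line as the value. The lazy (.*?) stops at the
# first colon, so the colon inside "http://foo" stays in the value.
PAIR = re.compile(r'\s*(.*?):[\n\s]?(.*)$', re.MULTILINE)

def parse_ticket(text):
    """Return a dict of key -> value (assumes keys are unique)."""
    return dict(PAIR.findall(text))

fields = parse_ticket(ticket)
print(fields['Host'])  # foobar
print(fields['url'])   # http://foo
```

Lines without a colon, such as {block}, are simply skipped because the pattern cannot match them.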
Upvotes: 1
Reputation: 4467
You can still use a regex if the input is well formed:
>>> re.findall('(.*?):\n(.*)$', content, re.MULTILINE)
[('Name', 'snakeoil'), ('Host', 'foobar')]
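Since `content` is not defined in the snippet, here is a runnable sketch that fills it in with the sample from the question. The function name `next_line_pairs` is made up for illustration. Note that this pattern requires a newline right after the colon, so it only picks up keys whose value sits on the following line.

```python
import re

# Sample ticket text from the question.
content = """Name:
snakeoil
Host:
foobar
{block}
email: some data here
url: http://foo
date: 01/02/16
{block}
"""

def next_line_pairs(text):
    """Find (key, value) pairs where the value is on the line after 'key:'."""
    return re.findall(r'(.*?):\n(.*)$', text, re.MULTILINE)

pairs = next_line_pairs(content)
print(pairs)  # [('Name', 'snakeoil'), ('Host', 'foobar')]
```

The email, url and date lines are skipped here because their colon is followed by a space rather than a newline.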
Upvotes: 2