How to parse multi-line text in a clean way using Python?

Question

As an example of the type of content I have to parse off of a ticket:

Name:
snakeoil
Host:
foobar

{block}
  email: some data here
  url: http://foo
  date: 01/02/16
{block}

I can identify the 'key', which is any word typically ending in a colon.

I could use a regex match like ^\w$ to extract the key, but I must handle two cases: the value on the same line as the key, and the value on the following line.

Fetching the value from the next line is the part I can't see how to handle cleanly and/or effectively.

Tomasz Jakub Rup · Accepted Answer

If you need email, url and date too (with s holding the ticket text):

>>> import re
>>> re.findall(r'\s*(.*?):[\n\s]?(.*)$', s, re.MULTILINE)
[('Name', 'snakeoil'), ('Host', 'foobar'), ('email', 'some data here'), ('url', 'http://foo'), ('date', '01/02/16')]

If not, @QiangJin's solution is good.
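
For anyone who wants to run the pattern end to end, here is a minimal self-contained sketch. It assumes the ticket text matches the sample in the question; the names TICKET, PAIR_RE and parse_ticket are hypothetical and only there for illustration. It collects the pairs into a dict so the values can be looked up by key.

```python
import re

# Sample ticket text, copied from the question.
TICKET = """\
Name:
snakeoil
Host:
foobar

{block}
  email: some data here
  url: http://foo
  date: 01/02/16
{block}
"""

# Same pattern as above, written as a raw string so the backslashes
# reach the regex engine unchanged.
PAIR_RE = re.compile(r'\s*(.*?):[\n\s]?(.*)$', re.MULTILINE)


def parse_ticket(text):
    """Return a dict mapping each key to its value (hypothetical helper)."""
    # The lazy (.*?) stops at the first colon, so a value such as
    # "http://foo" stays whole even though it contains a colon.
    return dict(PAIR_RE.findall(text))


fields = parse_ticket(TICKET)
print(fields['Host'])  # foobar
print(fields['url'])   # http://foo
```

Note that a dict keeps only the last value for a repeated key, so if the same key can appear more than once in a ticket, keep the list of tuples that re.findall returns instead.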
