CaseyJones

Reputation: 505

How to parse multi-line key/value text cleanly in Python?

Here is an example of the kind of content I have to parse out of a ticket:

Name:
snakeoil
Host:
foobar

{block}
  email: some data here
  url: http://foo
  date: 01/02/16
{block}

I can identify the 'key', which is any word typically ending in a colon.

I could use the re module with a pattern like ^(\w+): to extract the key, but I must handle both the case where the value is on the same line and the case where it is on the following line.

Fetching the value from the next line is the part I can't figure out how to handle cleanly and efficiently.

Upvotes: 0

Views: 113

Answers (2)

Tomasz Jakub Rup

Reputation: 10680

If you need email, url and date too:

>>> re.findall(r'\s*(.*?):[\n\s]?(.*)$', s, re.MULTILINE)
[('Name', 'snakeoil'), ('Host', 'foobar'), ('email', 'some data here'), ('url', 'http://foo'), ('date', '01/02/16')]

If not, @QiangJin's solution is good.
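
For anyone who wants to run this end to end, here is a minimal sketch under a few assumptions: the ticket text is the sample from the question, and the names TICKET, PAIR_RE and parse_pairs are made up for illustration. It wraps the same pattern in a small helper that returns a dict.

```python
import re

# Sample ticket text copied from the question.
TICKET = """Name:
snakeoil
Host:
foobar

{block}
  email: some data here
  url: http://foo
  date: 01/02/16
{block}
"""

# Key, colon, then the value either on the same line or on the next one.
# A raw string keeps the backslashes intact for the regex engine.
PAIR_RE = re.compile(r'\s*(.*?):[\n\s]?(.*)$', re.MULTILINE)


def parse_pairs(text):
    """Return a dict mapping each key to its value (hypothetical helper)."""
    return dict(PAIR_RE.findall(text))


if __name__ == "__main__":
    print(parse_pairs(TICKET))
```

Note that the pattern consumes at most one whitespace character after the colon, so a value preceded by several spaces would keep the extra leading spaces.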

Upvotes: 1

Qiang Jin

Reputation: 4467

You can still use a regex if the input is well formed:

>>> re.findall('(.*?):\n(.*)$', content, re.MULTILINE)
[('Name', 'snakeoil'), ('Host', 'foobar')]
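
If the input is less regular, a line-by-line pass may be easier to reason about than a single multi-line pattern. The sketch below is one possible approach, not the only one: it assumes a key is a single word followed by a colon, and that a key with nothing after the colon takes the whole next line as its value. The names KEY_RE and parse_ticket are hypothetical.

```python
import re

# A key is assumed to be one word followed by a colon; anything after
# the colon on the same line is treated as the value.
KEY_RE = re.compile(r'^\s*(\w+):\s*(.*)$')


def parse_ticket(text):
    """Parse 'key: value' and 'key:' + next-line-value pairs into a dict."""
    result = {}
    pending_key = None  # key still waiting for its value on the next line
    for line in text.splitlines():
        if pending_key is not None:
            # The previous line was a bare key, so this line is its value.
            result[pending_key] = line.strip()
            pending_key = None
            continue
        match = KEY_RE.match(line)
        if not match:
            continue  # blank lines, {block} markers, and so on
        key, value = match.groups()
        value = value.strip()
        if value:
            result[key] = value  # value on the same line
        else:
            pending_key = key    # value expected on the next line
    return result


if __name__ == "__main__":
    sample = "Name:\nsnakeoil\nHost:\nfoobar\n\n{block}\n  email: some data here\n{block}\n"
    print(parse_ticket(sample))
```

Keeping the "pending key" state explicit makes the same-line and next-line cases two branches of one loop, and it is straightforward to extend (for example, to skip blank lines before taking a value).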

Upvotes: 2
