nemesifier

Reputation: 8529

RegExp: Split text by delimiter

In Python, what's the best way to split a text like the following so that I can have 3 lists that represent each package (system, network and another_package)?

package system

config system 'system'
    option hostname 'test-system'

package network

config interface 'lan'
    option ifname 'eth0'
    option proto 'none'

package another_package

config etc 'etc'
    option name 'val'

E.g. (very ugly):

re.split('package ', text)

Would it be possible to capture the package name too?

EDIT - maybe I figured it out:

re.split(r'(package\s\w*)', text)

Upvotes: 0

Views: 61

Answers (1)

jez

Reputation: 15349

Your "very ugly" re.split already does it as far as I can tell. One possible tweak would be to make the pattern r'^\s*package ' and add the multi-line flag re.M. That would ensure that it only matches "package" as the first word on a line.
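To illustrate why the anchor can matter, here is a minimal sketch. The sample input is hypothetical (it is not from the question): it assumes the word "package " could also appear inside an option value, which the plain pattern would then split on as well.

```python
import re

# Hypothetical input: "package " also occurs inside an option value.
text = (
    "package system\n"
    "\n"
    "config system 'system'\n"
    "    option descr 'package manager'\n"
)

# Plain pattern: splits wherever "package " occurs, even mid-line.
plain = re.split('package ', text)

# Anchored pattern: only splits where "package " starts a line.
anchored = re.split(r'^\s*package ', text, flags=re.M)

print(len(plain))     # one split too many
print(len(anchored))  # only the real package header is matched
```

With this input the plain pattern cuts the option value in two, while the anchored one leaves it intact.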

This captures all the non-blank sections:

[section.strip() for section in re.split(r'^\s*package ', text, flags=re.M) if section.strip()]

...and the first word in each section is the package name.
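If you want the name separated from the rest, one option is to peel off the first line of each section. This is only a sketch: split_packages and packages are names made up for the example, and it assumes the package name is the only thing on the "package" line.

```python
import re

text = """package system

config system 'system'
    option hostname 'test-system'

package network

config interface 'lan'
    option ifname 'eth0'
    option proto 'none'

package another_package

config etc 'etc'
    option name 'val'
"""


def split_packages(text):
    """Return a dict mapping each package name to the text of its section."""
    sections = [
        section.strip()
        for section in re.split(r'^\s*package ', text, flags=re.M)
        if section.strip()
    ]
    packages = {}
    for section in sections:
        # The first line is the package name, the rest is the section body.
        name, _, body = section.partition('\n')
        packages[name.strip()] = body.strip()
    return packages


packages = split_packages(text)
for name, body in packages.items():
    print(name)
    print(body)
    print('---')
```

A dict keeps the sections in the order they appear in the text (Python 3.7+), so list(packages.values()) gives you the three lists/blocks you asked for.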

Upvotes: 2
