Decula

Reputation: 508

Lua text parsing, space handling

I'm new to Lua, and I want to parse text like

Phase1:A B Phase2:A B Phase3:W O R D Phase4:WORD

to

Phase1         Phase2      Phase3     Phase4

A              A B         W O R D    WORD

I used string.gmatch(s, "(%w+):(%w+)"), but I only get

Phase1     Phase2     Phase3       Phase4

A          A          W            WORD

How can I get the missing B, O, R, D back?
Or do I need to write a separate pattern for every phase? If so, how?

Upvotes: 3

Views: 778

Answers (2)

Decula

Reputation: 508

for k, v in s:gsub('%s*(%w+:)','\0%1'):gmatch'%z(%w+):(%Z*)' do print(k, v) end

– @Egor Skriptunoff
This pattern works better.
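
For reference, a complete loop around that idea might look like the sketch below. It puts a marker byte in front of every "word:" key, then captures everything up to the next marker as the value. The helper name parse_phases and the result layout are only illustrative. The sketch also swaps the "\0" marker for "\1" so that %z (deprecated since Lua 5.2) is not needed, and it assumes that "\1" never appears in the input and that no value contains a word followed directly by a colon.

```lua
-- Hypothetical helper: split "Key:value words Key2:more words" into pairs.
-- Uses "\1" as the marker byte instead of "\0" so that %z is not needed;
-- this assumes "\1" never appears in the input.
local function parse_phases(s)
  local result = {}
  -- Put a marker in front of every "word:" and drop the spaces before it.
  local marked = s:gsub("%s*(%w+:)", "\1%1")
  -- Each key runs up to the colon; each value runs up to the next marker.
  for k, v in marked:gmatch("\1(%w+):([^\1]*)") do
    result[#result + 1] = { key = k, value = v }
  end
  return result
end

local s = "Phase1:A B Phase2:A B Phase3:W O R D Phase4:WORD"
for _, pair in ipairs(parse_phases(s)) do
  print(pair.key, pair.value)
end
```

For the sample string this yields four pairs whose values are "A B", "A B", "W O R D" and "WORD", with no trailing spaces, because the gsub step consumes the whitespace before each key.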

Upvotes: 0

greatwolf

Reputation: 20888

The input text in your example doesn't have a clear delimiter between the phrases, so parsing it accurately with a Lua pattern is tricky.

This would be much easier to parse if you added a delimiter, such as a comma, to separate the phrases:

Phrase1:A B, Phrase2:A B, Phrase3:W O R D,Phrase4:WORD

You can then parse it with this pattern:

s = "Phrase1:A B, Phrase2:A B, Phrase3:W O R D,Phrase4:WORD"

for k, v in s:gmatch "(Phrase%d+):([^,]+)" do
    print(k, v)
end

outputs:

Phrase1 A B
Phrase2 A B
Phrase3 W O R D
Phrase4 WORD

If you can't change the input format this way, you can try this pattern:

  s:gmatch "Phrase%d+:%w[%w ]* "

Note the caveat with this pattern: the string you're parsing must end with an extra space, or the last phrase won't be matched.
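
One way to live with that caveat is to append the space yourself before matching. The sketch below does that and also adds two captures so the key and value come back separately. The captures and the phrases table are additions for illustration, not part of the pattern above, and the sketch assumes values contain only alphanumerics and single spaces.

```lua
-- Sketch: append one space so the final phrase also ends in the
-- delimiter the pattern expects. Captures are added for illustration.
local s = "Phrase1:A B Phrase2:A B Phrase3:W O R D Phrase4:WORD"

local phrases = {}
for k, v in (s .. " "):gmatch("(Phrase%d+):(%w[%w ]*) ") do
  -- The greedy [%w ]* first runs into the next key, then backtracks
  -- to the last space before it, so v holds only this phrase's words.
  phrases[#phrases + 1] = { key = k, value = v }
  print(k, v)
end
```

Appending the space leaves the original string untouched, since s .. " " builds a new string for the match only.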

Upvotes: 4
