user3320592

Reputation: 13

Can't get correct regex pattern to parse song info?

I have the following line to process:

...playlist index:109 id:38522 title:Christmas in Heaven artist:B.B. King album:A Christmas Celebration of Hope playlist index:110 id:38523 title:I'll Be Home for Christmas artist:B.B. King album:A Christmas Celebration of Hope playlist index:111 id:38524 title:To Someone That I Love artist:B.B. King album:A Christmas Celebration of Hope playlist index:112 id:38525 title:Christmas Celebration artist:B.B. King album:A Christmas Celebration of Hope playlist index:113 id:38526 title:Merry Christmas, Baby artist:B.B. King album:A Christmas Celebration of Hope

The best pattern I have so far is:

playlist index:(?<index>\d+) id:(?<id>\d+) title:(?<title>[\w\s',]+) artist:(?<artist>[\w\s'.]+) album:(?<album>[\w\s']+)

However, it only matches every other entry, because the word "playlist" (from the next "playlist index:") is consumed as part of the previous album name.
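A minimal sketch that reproduces the problem, assuming Python's `re` module (which spells named groups `(?P<name>...)` instead of `(?<name>...)`) and a shortened, three-entry copy of the line above:

```python
import re

# Shortened sample: three of the entries from the line in the question.
line = (
    "playlist index:109 id:38522 title:Christmas in Heaven "
    "artist:B.B. King album:A Christmas Celebration of Hope "
    "playlist index:110 id:38523 title:I'll Be Home for Christmas "
    "artist:B.B. King album:A Christmas Celebration of Hope "
    "playlist index:111 id:38524 title:To Someone That I Love "
    "artist:B.B. King album:A Christmas Celebration of Hope"
)

# The original pattern, only with Python's (?P<name>...) group spelling.
pattern = re.compile(
    r"playlist index:(?P<index>\d+) id:(?P<id>\d+) "
    r"title:(?P<title>[\w\s',]+) artist:(?P<artist>[\w\s'.]+) "
    r"album:(?P<album>[\w\s']+)"
)

matches = [m.groupdict() for m in pattern.finditer(line)]
for m in matches:
    # Show which entries matched and what the album group swallowed.
    print(m["index"], repr(m["album"]))
```

In this sample the first album group runs on through " playlist index" and stops only at the colon, so entry 110 no longer starts with "playlist index:" and is skipped.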

Upvotes: 0

Views: 59

Answers (2)

Bob Vale

Reputation: 18474

The simplest fix is to make the album group lazy and stop it with a lookahead:

playlist index:(?<index>\d+) id:(?<id>\d+) title:(?<title>[\w\s',]+) artist:(?<artist>[\w\s'.]+) album:(?<album>[\w\s']+?)(?=$|\splaylist)
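As a rough illustration of this variant, assuming Python's `re` module (named groups written as `(?P<name>...)`) and a two-entry sample: the lazy `+?` makes the album group take as few characters as possible, and the lookahead `(?=$|\splaylist)` says where it is allowed to stop.

```python
import re

# Two-entry sample taken from the line in the question.
line = (
    "playlist index:109 id:38522 title:Christmas in Heaven "
    "artist:B.B. King album:A Christmas Celebration of Hope "
    "playlist index:110 id:38523 title:I'll Be Home for Christmas "
    "artist:B.B. King album:A Christmas Celebration of Hope"
)

# Lazy album group followed by a lookahead for end of input or " playlist".
pattern = re.compile(
    r"playlist index:(?P<index>\d+) id:(?P<id>\d+) "
    r"title:(?P<title>[\w\s',]+) artist:(?P<artist>[\w\s'.]+) "
    r"album:(?P<album>[\w\s']+?)(?=$|\splaylist)"
)

songs = [m.groupdict() for m in pattern.finditer(line)]
for song in songs:
    print(song["index"], song["title"], "/", song["album"])
```

One caveat with this form: because the group is lazy and the lookahead only checks for the word "playlist", an album whose name itself contains " playlist" would be cut short at that point.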

Upvotes: -1

Jerry

Reputation: 71548

You can make use of a positive lookahead to limit the number of characters the album part takes:

playlist index:(?<index>\d+) id:(?<id>\d+) title:(?<title>[\w\s',]+) artist:(?<artist>[\w\s'.]+) album:(?<album>[\w\s']+)(?= playlist index:|$)
                                                                                                                         ^^^^^^^^^^^^^^^^^^^^^^

regex101 demo

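The lookahead makes sure that wherever the match ends, it is followed either by " playlist index:" or by the end of the line ($). Because a lookahead does not consume any characters, the next match can still start at that "playlist index:".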
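A small sketch of the same pattern in Python, assuming the `re` module (where named groups are written `(?P<name>...)`) and a shortened three-entry sample. The album group stays greedy and simply backtracks until the lookahead is satisfied:

```python
import re

# Shortened sample: three of the entries from the line in the question.
line = (
    "playlist index:109 id:38522 title:Christmas in Heaven "
    "artist:B.B. King album:A Christmas Celebration of Hope "
    "playlist index:110 id:38523 title:I'll Be Home for Christmas "
    "artist:B.B. King album:A Christmas Celebration of Hope "
    "playlist index:111 id:38524 title:To Someone That I Love "
    "artist:B.B. King album:A Christmas Celebration of Hope"
)

# Greedy album group; the lookahead forces it to end right before
# " playlist index:" or at the end of the input.
pattern = re.compile(
    r"playlist index:(?P<index>\d+) id:(?P<id>\d+) "
    r"title:(?P<title>[\w\s',]+) artist:(?P<artist>[\w\s'.]+) "
    r"album:(?P<album>[\w\s']+)(?= playlist index:|$)"
)

songs = [m.groupdict() for m in pattern.finditer(line)]
for song in songs:
    print(song["index"], song["title"], "/", song["artist"], "/", song["album"])
```

With this sample every entry is matched, and each album comes out as "A Christmas Celebration of Hope" without the trailing "playlist index".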

Upvotes: 2
