Michael Grassman

Reputation: 1935

Match words with spaces after multiple prefixes

I have the following string:

D_Doc Name L_Linked Doc Q_1_5

or

D_Doc Name L_Linked Doc Q_5

I'm having a hard time creating a regex to match the following:

Doc Name
Linked Doc
1_5 or 5

D_Doc Name is always present; L_ and Q_ are optional.

The string may also look like any of the following:

D_Doc Name Doc Q_1_5
D_Doc Name Doc Q_5
D_Doc Name L_Linked Doc

I would like to be able to reference the matches as match['DocName'], or in some other meaningful way, so I know which parts were found and which weren't.

Any suggestions?

Upvotes: 0

Views: 98

Answers (3)

svick

Reputation: 244948

If I understand you correctly, the regex you want is something like:

^D_(?<D>.*?)( L_(?<L>.*?))?( Q_(?<Q>.*))?$

It produces the following results for some test inputs:

Input                          D             L           Q 
D_Doc Name L_Linked Doc Q_1_5  Doc Name      Linked Doc  1_5
D_Doc Name Doc Q_1_5           Doc Name Doc              1_5
D_Doc Name Doc Q_5             Doc Name Doc              5
D_Doc Name L_Linked Doc        Doc Name      Linked Doc
D_Doc Name Doc Q_5             Doc Name Doc              5
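
For anyone who wants to try the pattern outside .NET, here is a minimal sketch in Python. It assumes Python's re module, where named groups are written (?P<name>...) instead of (?<name>...); the names PATTERN and parse are only illustrative and not part of the original answer.

```python
import re

# The pattern from the answer above, with the named groups rewritten
# in Python's (?P<name>...) syntax. Nothing else is changed.
PATTERN = re.compile(r"^D_(?P<D>.*?)( L_(?P<L>.*?))?( Q_(?P<Q>.*))?$")


def parse(s):
    """Return a dict with keys D, L and Q, or None if s does not match.

    Optional parts that are absent from the input come back as None.
    """
    m = PATTERN.match(s)
    return m.groupdict() if m else None


if __name__ == "__main__":
    for line in [
        "D_Doc Name L_Linked Doc Q_1_5",
        "D_Doc Name Doc Q_1_5",
        "D_Doc Name L_Linked Doc",
    ]:
        print(line, "->", parse(line))
```

Because missing parts come back as None, checking which of L and Q were present is a plain None test on the returned dict.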

Upvotes: 1

Scott Rippey

Reputation: 15810

Your requirements are a little tricky to decipher, but I think this will do it:

D_(\w+) (\w+) (L_(\w+) )?(\w+)( (Q_)?(\w+))?

and if you want to add "Named Groups" (with what I assume are appropriate names):

D_(?<Doc>\w+) (?<DocName>\w+) (L_(?<Linked>\w+) )?(?<LinkedDoc>\w+)( (Q_)?(?<Q>\S+))?
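
To see which word ends up in which group, here is a rough sketch in Python. It assumes Python's re module, so the named groups are written (?P<name>...); WORD_PATTERN and word_groups are made-up names for illustration only.

```python
import re

# The named-group pattern above, rewritten in Python's (?P<name>...) syntax.
WORD_PATTERN = re.compile(
    r"D_(?P<Doc>\w+) (?P<DocName>\w+) (L_(?P<Linked>\w+) )?"
    r"(?P<LinkedDoc>\w+)( (Q_)?(?P<Q>\S+))?"
)


def word_groups(s):
    """Return the named groups as a dict, or None if s does not match."""
    m = WORD_PATTERN.match(s)
    return m.groupdict() if m else None


if __name__ == "__main__":
    print(word_groups("D_Doc Name L_Linked Doc Q_1_5"))
    print(word_groups("D_Doc Name Doc Q_5"))
```

Note that every word lands in its own group here, so a multi-word value such as "Doc Name" has to be put back together from Doc and DocName after matching.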

Upvotes: 0

Salvatore Previti

Reputation: 9070

A regex may be overkill for this problem. I would use a simple string.Split(s, ' ') and then analyze the words one by one, perhaps with a regex for the last word, although the last word is easy to split by hand as well. I think it would be simpler to write your code working directly on the array of words.
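
A minimal sketch of that idea, written in Python rather than .NET. The function name parse_fields and the rule "a word starting with D_, L_ or Q_ opens a new field" are my assumptions about how the words would be analyzed, not something spelled out above.

```python
def parse_fields(s):
    """Split s on spaces and attach each word to the most recent prefix seen.

    Assumption: a word starting with D_, L_ or Q_ starts a new field, and
    every following word belongs to that field until the next prefix.
    Fields that never appear stay None.
    """
    fields = {"D": None, "L": None, "Q": None}
    current = None
    for word in s.split(" "):
        if word[:2] in ("D_", "L_", "Q_"):
            # New field: remember its key and drop the two-character prefix.
            current = word[0]
            fields[current] = word[2:]
        elif current is not None:
            # Continuation word: append it to the field being built.
            fields[current] += " " + word
    return fields


if __name__ == "__main__":
    print(parse_fields("D_Doc Name L_Linked Doc Q_1_5"))
    print(parse_fields("D_Doc Name Doc Q_5"))
```

One caveat of this sketch: a document name that itself contains a word starting with one of the prefixes would be split at that word.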

Upvotes: 0
