fro_oo

Reputation: 1610

Regular expression to match some conditions given a formatted file name?

(Sorry for the bad title, any suggestions appreciated) ;-)

Consider these strings:

first = "SC/SCO_160ZA206_T_mlaz_kdiz_nziizjeij.ext"
second = "MLA/SA2_jkj15PO_B_lkazkl lakzlk-akzl.oxt"
third = "A12A/AZD_KZALKZL_F_LKAZ_AZ__azaz___.ixt"

I'm looking for a regular expression that lets me get arrays like these (in Ruby):

first_array = ['SCO', '160ZA206', 'T', 'mlaz_kdiz_nziizjeij']
second_array = ['SA2', 'jkj15PO', 'B', 'lkazkl lakzlk-akzl']
third_array = ['AZD', 'KZALKZL', 'F', 'LKAZ_AZ__azaz___']

The first match must be anything right after the / and before the first _

The second match must be anything between the first and the second _

The third match must be anything between the second and the third _

The last match must be anything between the third _ and the last .

I can't get it: [^\/].?([A-Z]*)_(.*)_(.*)[\.$] :-(

Upvotes: 3

Views: 203

Answers (2)

steenslag

Reputation: 80065

Following up on @fge's split suggestion:

str = "SC/SCO_160ZA206_T_mlaz_kdiz_nziizjeij.ext"
p str[(str.index('/') + 1)...str.rindex('.')].split('_', 4)
#=> ["SCO", "160ZA206", "T", "mlaz_kdiz_nziizjeij"]

It splits on _ into at most 4 elements; the fourth element holds the remainder, including any further underscores.
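
For reuse across several file names, the same idea could be wrapped in a small method. This is only a sketch: the name split_file_name is made up for illustration, and it assumes every input contains a / and at least one . after it.

```ruby
# Hypothetical helper (the name is not from the answer above).
# Assumes the path contains a '/' and a later '.'.
def split_file_name(path)
  start  = path.index('/') + 1   # first character after the first slash
  finish = path.rindex('.')      # position of the last dot (excluded below)
  # Limit of 4: the fourth element keeps any remaining underscores.
  path[start...finish].split('_', 4)
end

p split_file_name("MLA/SA2_jkj15PO_B_lkazkl lakzlk-akzl.oxt")
p split_file_name("A12A/AZD_KZALKZL_F_LKAZ_AZ__azaz___.ixt")
```

Using rindex rather than index for the dot means an extra dot inside the last field would stay in that field.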

Upvotes: 1

Dylan Markow

Reputation: 124419

You're super close. Just add a question mark to the second matcher to make it lazy (otherwise, it won't stop at the first underscore), and then duplicate that matcher.

[^\/].?([A-Z]*)_(.*?)_(.*?)_(.*)[\.$]
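
One thing to watch for: ([A-Z]*) accepts only uppercase letters, so a first field containing a digit (like SA2 in the second example) would not be captured whole, and inside [\.$] the $ is a literal character rather than an end-of-string anchor. A possible variant, as a sketch only, anchors on the slash and uses negated character classes for the first three fields. The constant and method names here are made up for illustration.

```ruby
# Hypothetical names, for illustration only.
# /           the slash before the file name
# ([^_/]*)    first field: everything up to the first underscore
# ([^_]*)     second and third fields: no underscores allowed
# (.*)        fourth field: greedy, so it runs up to the last dot
# \.[^.]*\z   the extension, anchored at the end of the string
FILE_NAME_PATTERN = %r{/([^_/]*)_([^_]*)_([^_]*)_(.*)\.[^.]*\z}

def parse_file_name(path)
  match = FILE_NAME_PATTERN.match(path)
  match && match.captures   # nil when the path does not fit the format
end

p parse_file_name("MLA/SA2_jkj15PO_B_lkazkl lakzlk-akzl.oxt")
```

Returning nil for a non-matching path lets the caller decide how to handle malformed names.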

Upvotes: 6
