daisy

Reputation: 23501

Why isn't this regex working in JavaScript (works in Perl)?

I'm trying to match the following string:

Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.7a DAV/2 PHP/5.2.13

with the following regex:

(?:Apache.([\w.]+))?.*(?:OpenSSL.([\w.]+))?.*(?:PHP.([\w.]+))?

Since any of the XXX/version pairs may be missing, I added a ? after each non-capturing group.

But only the first version string is matched:

var re = /(?:Apache.([\w.]+))?.*(?:OpenSSL.([\w.]+))?.*(?:PHP.([\w.]+))?/;
var captures = re.exec('Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.7a DAV/2 PHP/5.2.13');
console.log(captures);

// captures[1] = '2.2.6'
// captures[2] = undefined
// captures[3] = undefined

Any ideas? Removing the ? quantifiers makes it work, and I don't know why (the original works in Perl).

EDIT

A regex that does "work" (without the ? quantifiers) is:

(?:Apache.([\w.]+)).*(?:OpenSSL.([\w.]+)).*(?:PHP.([\w.]+))

Upvotes: 0

Views: 42

Answers (1)

dtanders

Reputation: 1845

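.* is greedy, so I'm not surprised it isn't capturing beyond the first group: the first .* consumes the rest of the string, and the optional groups after it are then satisfied by matching nothing. What surprises me is that it worked anywhere at all. Make those catch-alls lazy and pull the capturing groups out of the non-capturing ones, and it seems to work: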

(?:Apache.)(\w+)?.*?(?:OpenSSL.)([\w.]+)?.*?(?:PHP.)([\w.]+)?
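Note that the pattern above still requires the literal names Apache, OpenSSL and PHP to be present, and (\w+) stops at the first dot, so it yields only "2" for Apache. If the pairs really can be missing, one possible variation is to keep each group optional but move a lazy .*? inside it, anchored at the start of the string. The following is only a sketch under some assumptions: the components appear in this fixed order when present, and parseServerVersions is a hypothetical helper name, not something from the question.

```javascript
// Each optional group carries its own lazy skip (.*?), so a missing
// component makes only that group fall back to matching nothing.
const versionPattern =
  /^(?:.*?Apache.([\w.]+))?(?:.*?OpenSSL.([\w.]+))?(?:.*?PHP.([\w.]+))?/;

// Hypothetical helper: returns undefined for components that are absent.
// exec() never returns null here because every part of the pattern is optional.
function parseServerVersions(header) {
  const [, apache, openssl, php] = versionPattern.exec(header);
  return { apache, openssl, php };
}

// All three components present.
const full = parseServerVersions(
  'Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.7a DAV/2 PHP/5.2.13'
);
console.log(full);

// OpenSSL missing: only that capture comes back undefined.
const partial = parseServerVersions('Apache/2.2.6 (Unix) PHP/5.2.13');
console.log(partial);
```

Because each group skips forward from where the previous one ended, this does not handle components listed in a different order. In that case running one small regex per name is probably simpler.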

Upvotes: 2
