Sukhbir

Reputation: 671

Regex match is not what I expect

In the matches given below, why does the first match give that output, and why is '-' not matched against the target in the second one?

Index: 01234567890123456789012345678
Target: whois who? Who me? Who else?
Pattern: (^[a-z])|(\?$)
Match: (0,0:w)(28,28:?)

Index: 01234567890
Target: \-^$.?*+()|
Pattern: [\\-^$.?*+()|]
Match: (0,0:\)(2,2:^)(3,3:$)(4,4:.)(5,5:?)(6,6:*)(7,7:+)(8,8:()(9,9:))(10,10:|)

Edit:

Thanks for asking for the code.

Please find the code here: http://paste.ubuntu.com/11831819/

Upvotes: 0

Views: 79

Answers (1)

ikrabbe

Reputation: 1929

The first pattern matches a single lowercase letter at the start of the line/string (^[a-z]) or a question mark at the end of the line/string (\?$). That is because:

  • ^ means start of the line
  • $ means end of the line
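As a rough check of the first pattern, here is a minimal sketch in Python's re module. Python is an assumption on my part, since the question does not say which language or regex engine is used, and the target is copied exactly as it is printed in the question.

```python
import re

# Target and pattern copied from the question as displayed.
target = "whois who? Who me? Who else?"
pattern = re.compile(r"(^[a-z])|(\?$)")

# Collect (start index, matched text) for every non-overlapping match.
matches = [(m.start(), m.group()) for m in pattern.finditer(target)]
print(matches)
```

Only the leading w and the final ? are reported. The ? characters in the middle of the string do not match, because $ requires the end of the string right after them.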

In the second one, [] means "match one character from this set", and a - inside it means "between". So \\-^ matches characters whose ASCII value is between \ (ASCII value 92) and ^ (ASCII value 94), that is \, ] and ^, and the rest of the set adds the literal characters $.?*+()|. Since the ASCII value of - is 45, it lies outside that range and is not matched.
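The range can be inspected directly. This is a small sketch, again assuming Python's re module, which reads \\ inside a set as one literal backslash; other engines may differ in details.

```python
import re

# The character class from the question, unchanged.
char_class = re.compile(r"[\\-^$.?*+()|]")

# Every character covered by the range from backslash (92) to caret (94).
in_range = [chr(code) for code in range(ord("\\"), ord("^") + 1)]
print(in_range)

# The hyphen has code 45, so it is outside the range and not in the set.
print(ord("-"), bool(char_class.search("-")))

# The closing bracket (93) is inside the range, so it matches unintentionally.
print(ord("]"), bool(char_class.search("]")))
```

A side effect of the accidental range is that ] is matched even though it was never listed in the set.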

To solve your problem, you should either escape the -

[\\\-^$.?*+()|]

or put it at the end of the set

[\\^$.?*+()|-]

Of course the following is bash (sed) rather than your language, but the same rule applies:

echo 'begin []\^$.?*+()|- end' | sed -e 's/[][\\^$.?*+()|-]/x/g'
begin xxxxxxxxxxxxx end

All special characters have been replaced by x although I only quoted \, because all the other characters are placed where they are taken literally (] first, - last, ^ not first). If I move the - or the [ or the ], I have to quote them too.
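To compare the original set with the two corrected ones on the target from the question, something like the following could be used. It is a sketch assuming Python's re module; the dictionary keys are just illustrative labels.

```python
import re

# Target from the question: every character should be matched.
target = r"\-^$.?*+()|"

char_classes = {
    "original": r"[\\-^$.?*+()|]",      # - forms the range \ to ^
    "escaped": r"[\\\-^$.?*+()|]",      # - escaped with a backslash
    "hyphen_last": r"[\\^$.?*+()|-]",   # - moved to the end of the set
}

# For each class, list the characters of the target that it matches.
results = {name: re.findall(cc, target) for name, cc in char_classes.items()}

for name, found in results.items():
    missing = [ch for ch in target if ch not in found]
    print(name, "misses", missing)
```

With the original set only the - is missed, while both corrected sets match every character of the target.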

Upvotes: 4
