user1263513

Reputation: 91

What's wrong with this RegExp?

I believe I did nothing wrong with this:

sed -e "s_//[01]\.thumbs\.4chan\.org/[a-z0-9]\+/src/\([0-9]\*\)s\.jpg_/${LOC}/\1s.jpg_g" -e "s_//images\.4chan\.org/[a-z0-9]\+/src/\([0-9]\*\)\.\(jpg\|gif\|png\)_/${LOC}/\1.\2_g" $LOC.html > a

Can someone tell me why it doesn't convert online links to offline links?

Upvotes: 2

Views: 126

Answers (2)

Peter.O

Reputation: 6856

You are using sed in basic regex mode, where + has to be escaped as \+ to act as a quantifier (a GNU extension), as you have done. The asterisk is the opposite: it must be left as is, not escaped, because \* matches a literal asterisk. That is why \([0-9]\*\) never matches the digits in your file names.

If you want to simplify things, run sed in extended regex mode with the -r option. Then you won't need to escape +, (, ), and so on.

Here are a couple of tests, using \+ and *:

echo '//0.thumbs.4chan.org/abc123/src/029s.jpg' |
    sed -n "\_//[01]\.thumbs\.4chan\.org/[a-z0-9]\+/src/\([0-9]*\)s\.jpg_p"

echo '//images.4chan.org/abc123/src/029.jpg' |
    sed -n "\_//images\.4chan\.org/[a-z0-9]\+/src/\([0-9]*\)\.\(jpg\|gif\|png\)_p"

output:

//0.thumbs.4chan.org/abc123/src/029s.jpg
//images.4chan.org/abc123/src/029.jpg
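
Applied to the substitution from the question, the same fix might look like the sketch below. It assumes GNU sed; LOC=thread and the sample HTML line are made-up values for illustration, and the function names fix_bre and fix_ere are hypothetical. The second function uses -E, which GNU sed accepts as an equivalent spelling of -r.

```shell
# Assumes GNU sed: \+ and \| are GNU extensions to basic regex syntax.
LOC=thread   # hypothetical value; the question does not show how LOC is set

# Basic regex mode: the question's command with \* changed to *.
fix_bre() {
    sed -e "s_//[01]\.thumbs\.4chan\.org/[a-z0-9]\+/src/\([0-9]*\)s\.jpg_/${LOC}/\1s.jpg_g" \
        -e "s_//images\.4chan\.org/[a-z0-9]\+/src/\([0-9]*\)\.\(jpg\|gif\|png\)_/${LOC}/\1.\2_g"
}

# Extended regex mode: the same patterns without backslashes on + ( ) |.
fix_ere() {
    sed -E -e "s_//[01]\.thumbs\.4chan\.org/[a-z0-9]+/src/([0-9]*)s\.jpg_/${LOC}/\1s.jpg_g" \
           -e "s_//images\.4chan\.org/[a-z0-9]+/src/([0-9]*)\.(jpg|gif|png)_/${LOC}/\1.\2_g"
}

# Made-up sample line containing one full-size link and one thumbnail link.
sample='<a href="//images.4chan.org/abc123/src/029.png"><img src="//0.thumbs.4chan.org/abc123/src/029s.jpg"></a>'
bre_out=$(printf '%s\n' "$sample" | fix_bre)
ere_out=$(printf '%s\n' "$sample" | fix_ere)
printf '%s\n' "$bre_out" "$ere_out"
```

With the file from the question, either function could be used as fix_bre < "$LOC.html" > a.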

Upvotes: 1

wallyk

Reputation: 57774

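I think \* should be *. Escaped, it looks for a literal * instead of repeating [0-9]. The \+ is fine as long as you stay in basic regex mode with GNU sed; there a bare + would be matched literally.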
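
The difference between \* and * can be checked directly with a cut-down pattern. This is only a sketch: the input strings and the variable names are invented for the demonstration, and the bracketed replacement is there just to make the captured group visible.

```shell
# \* is a literal asterisk, so "one digit, then *" is not found in 029s.jpg
escaped=$(echo 'src/029s.jpg' | sed 's_src/\([0-9]\*\)s_[\1]_')

# * repeats [0-9], so the group captures 029
plain=$(echo 'src/029s.jpg' | sed 's_src/\([0-9]*\)s_[\1]_')

# the escaped form only matches when the input really contains an asterisk
literal=$(echo 'src/7*s.jpg' | sed 's_src/\([0-9]\*\)s_[\1]_')

echo "$escaped"   # src/029s.jpg (unchanged)
echo "$plain"     # [029].jpg
echo "$literal"   # [7*].jpg
```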

Upvotes: 2
