isf

Reputation: 61

Sed subexpressions not working as expected

I am trying to make a simple wikitext parser using sed/bash. When I run

echo "London has [[public transport]]" | sed s/\\[\\[[A-Za-z0-9\ ]*\\]\\]/link/

it prints London has link, as expected. But when I try to use a marked subexpression to capture the contents of the brackets with

sed s/\\[\\[\([A-Za-z0-9\ ]*\)\\]\\]/\1/

it just prints the input unchanged: London has [[public transport]]

Upvotes: 2

Views: 691

Answers (2)

shellter

Reputation: 37278

echo "London has [[public transport]]" | sed 's@[[][[]\([A-Za-z0-9 ]*\)[]][]]@\1@'

output

London has public transport

works on my machine.

I hope this helps.

Upvotes: 0

mathematical.coffee

Reputation: 56915

That's because the regex doesn't match.

Since you're not surrounding your sed expression in quotes, the shell strips one level of backslashes before sed ever sees it, so every backslash meant for sed has to be escaped. That's why you have \\[ instead of \[.

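In sed's default syntax (basic regular expressions), capturing groups are written \( and \). Without quotes, the shell turns your \( into a plain (, which sed treats as a literal parenthesis, so the pattern no longer matches the input. You need to escape the backslash for the shell, and since bash also interprets parentheses, you have to escape those too, which gives \\\( and \\\). The same applies to the backreference, which becomes \\1: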

echo "London has [[public transport]]" | sed s/\\[\\[\\\([A-Za-z0-9\ ]*\\\)\\]\\]/\\1/
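One way to check this is to hand the unquoted expression to printf instead of sed and look at what comes out. This is only a sketch: the variable names are made up for illustration, and it assumes no file in the current directory happens to match the unquoted pattern as a glob.

```shell
# What sed receives from the question's command (single backslash before the parentheses).
seen_broken=$(printf '%s\n' s/\\[\\[\([A-Za-z0-9\ ]*\)\\]\\]/\1/)

# What sed receives from the fully escaped command above.
seen_fixed=$(printf '%s\n' s/\\[\\[\\\([A-Za-z0-9\ ]*\\\)\\]\\]/\\1/)

printf '%s\n' "$seen_broken"   # s/\[\[([A-Za-z0-9 ]*)\]\]/1/
printf '%s\n' "$seen_fixed"    # s/\[\[\([A-Za-z0-9 ]*\)\]\]/\1/
```

In the first line the parentheses are literal and the replacement is just the character 1, so sed finds nothing to replace and prints the input as it was.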

I strongly recommend you just enclose your sed expression in single quotes for ease of writing:

echo "London has [[public transport]]" | sed 's/\[\[\([A-Za-z0-9 ]*\)\]\]/\1/'

Much easier, right?
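If your sed supports extended regular expressions via -E (GNU and BSD sed both do, as far as I know), the capturing parentheses need no backslashes at all. A small sketch, with a second link added to the sample line purely for illustration:

```shell
# Sample input with two links; the second one is made up for this example.
line='London has [[public transport]] and [[black cabs]]'

# -E selects extended regular expressions, where plain ( ) capture.
# The trailing g replaces every link on the line, not only the first.
result=$(printf '%s\n' "$line" | sed -E 's/\[\[([A-Za-z0-9 ]*)\]\]/\1/g')

printf '%s\n' "$result"   # London has public transport and black cabs
```

The single quotes are still doing the real work here: they keep the shell from touching the backslashes and brackets, so what you type is what sed gets.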

Upvotes: 2
