Reputation: 489
How can I use sed to get the SOMETHING in <version.suffix>SOMETHING</version.suffix>?
I tried sed 's#.*>\(.*\)\<version\.suffix\>#\1#', but it fails.
Upvotes: 2
Views: 7175
Reputation: 2374
Assuming the formatting of the question is accurate, when I run the example in the question as-is:
$ echo '<version.suffix>SOMETHING</version.suffix>' | sed 's#.*>\(.*\)\<version\.suffix\>#\1#'
I see the following output:
SOMETHING</>
In case my formatting skills fail me: this output ends with a left angle bracket, a forward slash, and finally a right angle bracket.
So, why this "failure"? Well, on my system (Linux with GNU grep 2.14), the grep(1) man page includes the following snippet:
The Backslash Character and Special Expressions
The symbols \< and \> respectively match the empty string at the beginning and end of a word.
Other answers suggest good alternatives to extract the value in XML tag syntax; use them.
I just wanted to point out why the RE in the original problem fails on current Linux systems: some backslash-escaped symbols match no actual characters, but instead match empty word boundaries in tools such as GNU grep and GNU sed, which support \< and \> as an extension to POSIX regular expressions. So, in this example, the brackets in the source are matched in unexpected ways:
- .*> has matched the opening tag <version.suffix>
- \(.*\) has matched SOMETHING</, to be printed by the \1 back-reference
- the empty string before version.suffix is matched by \<
- version.suffix is matched by version\.suffix
- the empty string after version.suffix is matched by \>
- the final > character is not matched, so it remains in sed's pattern space and is printed

TL;DR: "\X" does not mean "just match an X" for all X!
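To see the difference in isolation, here is a minimal sketch. It assumes GNU sed (other sed implementations may not treat \< and \> as word boundaries), and the sample string a<b>c is made up for illustration.

```shell
#!/bin/sh
# Assumes GNU sed, where \< and \> are zero-width word-boundary anchors.
input='a<b>c'

# Escaped brackets: only the word "b" is replaced; the literal < and > survive.
escaped=$(printf '%s\n' "$input" | sed 's/\<b\>/X/')

# Unescaped brackets: the literal text "<b>" is replaced.
literal=$(printf '%s\n' "$input" | sed 's/<b>/X/')

printf '%s\n' "$escaped"   # a<X>c
printf '%s\n' "$literal"   # aXc
```

The same thing happens in the question's command: the escaped brackets consume nothing, so the real angle brackets are left for other parts of the pattern (or for nobody) to match.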
Upvotes: 1
Reputation: 63892
There are many possible ways, e.g.:
with sed
echo '<version.suffix>SOMETHING</version.suffix>' | sed 's#<[^>]*>##g'
or grep
echo '<version.suffix>SOMETHING</version.suffix>' | grep -oP '<version\.suffix>\K[^<]*(?=</version\.suffix>)'
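One thing to keep in mind with the sed variant is that it removes every tag on the line, whatever its name. A small sketch (strip_tags is a hypothetical helper name, and the second input line is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper wrapping the tag-stripping substitution shown above.
strip_tags() {
    sed 's#<[^>]*>##g'
}

one=$(printf '%s\n' '<version.suffix>SOMETHING</version.suffix>' | strip_tags)
two=$(printf '%s\n' '<a>1</a><b>2</b>' | strip_tags)

printf '%s\n' "$one"   # SOMETHING
printf '%s\n' "$two"   # 12
```

So it is fine for a line holding a single element, but it will run the values together if several elements share a line.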
Upvotes: 1
Reputation: 3860
Try this one:
sed 's/<.*>\(.*\)<.*>/\1/'
It should be general enough to get the value of any simple XML element that sits alone on its line.
If you need to eliminate the indentation, add \s* (a GNU sed extension) at the beginning, like this:
sed 's/\s*<.*>\(.*\)<.*>/\1/'
Alternatively, if you only want the value of version.suffix, you can make the command more specific, like this:
sed 's/<version\.suffix>\(.*\)<.*>/\1/'
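If the input has more than one line, note that a plain s/// still prints the lines that do not match. A rough sketch of the difference, assuming a made-up <project> wrapper as sample input and using [[:space:]] as a portable stand-in for \s:

```shell
#!/bin/sh
# Sample input; the <project> wrapper is made up for illustration.
sample='<project>
  <version.suffix>SOMETHING</version.suffix>
</project>'

# Without -n, lines that do not match are passed through unchanged.
all_lines=$(printf '%s\n' "$sample" | sed 's/^[[:space:]]*<version\.suffix>\(.*\)<.*>/\1/')

# With -n and the p flag, only lines where the substitution succeeded are printed.
only_value=$(printf '%s\n' "$sample" | sed -n 's/^[[:space:]]*<version\.suffix>\(.*\)<.*>/\1/p')

printf '%s\n' "$all_lines"
printf '%s\n' "$only_value"   # SOMETHING
```

The first command still emits the <project> and </project> lines around the value; the second prints only the extracted value.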
Upvotes: 3
Reputation: 174696
You could use the following sed command:
$ echo '<version.suffix>SOMETHING</version.suffix>' | sed 's#^<[^>]*>\(.*\)<\/[^>]*>$#\1#'
SOMETHING
^<[^>]*> matches the opening tag, <version.suffix>.
\(.*\) captures the characters up to the closing tag, and <\/[^>]*>$ matches the closing tag itself at the end of the line.
Your regex is almost correct; the only problem is that you forgot the / inside the closing tag (and the angle brackets must not be backslash-escaped, because \< and \> are word-boundary anchors in GNU sed).
$ echo '<version.suffix>SOMETHING</version.suffix>' | sed 's#.*>\(.*\)</version\.suffix>#\1#'
                                                                       |<-Here
SOMETHING
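If you need this for more than one tag, the anchored pattern can be wrapped in a small function. This is only a sketch: xml_text is a hypothetical helper name, it assumes one element per line with no attributes, and it is no substitute for a real XML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the text of <TAG>...</TAG> lines read from stdin.
# Assumes each element sits alone on its line and has no attributes.
xml_text() {
    # Escape characters that are special in a basic regular expression,
    # plus the # used as the s command delimiter below.
    xml_text_tag=$(printf '%s' "$1" | sed 's/[].[\*^$#]/\\&/g')
    # Anchor on both ends, allow surrounding whitespace, print matches only.
    sed -n "s#^[[:space:]]*<$xml_text_tag>\(.*\)</$xml_text_tag>[[:space:]]*\$#\1#p"
}

printf '%s\n' '<project>' '  <version.suffix>SOMETHING</version.suffix>' '</project>' |
    xml_text version.suffix   # SOMETHING
```

Escaping the tag name first matters here because the dot in version.suffix would otherwise match any character.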
Upvotes: 1