Tux

Reputation: 1843

SED RegEx Between Multiple Same Characters

How can I grab the following title text from between all these tags and symbols?

What I need to grab:

Some Title Here v1.2.3 Some Other Description About the Title in Here

Example source code:

<body><pre>=============================================================
Some Title Here v1.2.3 Some Other Description About the Title in Here
=============================================================

some other data here but I don't care about it ...

</pre></body></html>

I've tried the following, but it also grabs the top line, including everything before the pre tag. The bottom part seems to work fine, except that it also grabs the = symbols.

sed -n '/<pre>=/,/=/p' file

The result of the above sed command is:

<body><pre>=============================================================
Some Title Here v1.2.3 Some Other Description About the Title in Here
=============================================================

Any feedback about this would be appreciated. Thank you so much, and as always StackOverflow is the best community for Q's and A's =)

Upvotes: 2

Views: 217

Answers (3)

Guru

Reputation: 16994

Updating OP's solution:

$ sed -n '/<pre>=/,/=/{/=$/d;p;}' file 
Some Title Here v1.2.3 Some Other Description About the Title in Here

From the range of lines selected, delete those ending with =, so you are left with the line in-between.
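The two steps described above can be checked separately. The following is a minimal sketch, assuming the question's sample is saved to a temporary file (the variable names `sample`, `range_count`, and `title` are only for illustration):

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<body><pre>=============================================================
Some Title Here v1.2.3 Some Other Description About the Title in Here
=============================================================

some other data here but I don't care about it ...

</pre></body></html>
EOF

# Step 1: the range alone selects three lines (both = lines plus the title).
range_count=$(sed -n '/<pre>=/,/=/p' "$sample" | wc -l | tr -d ' ')

# Step 2: inside the range, delete lines ending in = and print the rest.
title=$(sed -n '/<pre>=/,/=/{/=$/d;p;}' "$sample")

echo "lines in range: $range_count"
echo "title: $title"
rm -f "$sample"
```

Note that `/=$/d` relies on the title line not ending in `=`; a title that did would be deleted as well.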

Upvotes: 0

potong

Reputation: 58438

This might work for you (GNU sed):

sed '/^<body><pre>=\+$/,/^=\+$/!d;//d' file
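As a rough reading of this command: `!d` deletes every line outside the range, and `//d` then deletes the lines inside the range that match the most recently used pattern, which are the two boundary lines. `\+` is a GNU extension; the sketch below assumes `==*` is an acceptable stand-in on other seds, and uses a temporary file named `sample` purely for illustration:

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<body><pre>=============================================================
Some Title Here v1.2.3 Some Other Description About the Title in Here
=============================================================

some other data here but I don't care about it ...

</pre></body></html>
EOF

# Same logic with '==*' (one or more =) in place of the GNU-only '=\+'.
# No -n here: lines that survive both deletes are printed automatically.
anchored=$(sed '/^<body><pre>==*$/,/^==*$/!d;//d' "$sample")

echo "$anchored"
rm -f "$sample"
```

Because both patterns are anchored with `^` and `$`, a stray `=` inside the title line would not end the range early.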

Upvotes: 0

Steve

Reputation: 54422

One way using GNU sed:

sed -n '/<pre>=/,/=/ { //!p }' file.txt

Result:

Some Title Here v1.2.3 Some Other Description About the Title in Here

Explanation:

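The empty pattern // reuses the most recently used regular expression, so //!p prints only the lines in the range that do not match the range's start or end pattern.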
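To make the effect of the empty pattern visible, here is a small sketch that compares it with a spelled-out form that names both boundary patterns explicitly. The variable names are hypothetical, and `{//!p;}` is written with a trailing semicolon in the hope that non-GNU seds accept it too:

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<body><pre>=============================================================
Some Title Here v1.2.3 Some Other Description About the Title in Here
=============================================================

some other data here but I don't care about it ...

</pre></body></html>
EOF

# Short form: // stands for whichever range pattern sed tested last.
short=$(sed -n '/<pre>=/,/=/{//!p;}' "$sample")

# Spelled-out form: drop lines matching either boundary, print the rest.
long=$(sed -n '/<pre>=/,/=/{/<pre>=/d;/=/d;p;}' "$sample")

echo "short form: $short"
echo "long form:  $long"
rm -f "$sample"
```

On this sample both forms should print only the title line.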

Upvotes: 3
