JonaK

Reputation: 129

Print text between two strings on the same line

I've been searching for a long time and have not been able to find a working answer to my problem.

I have a line from an HTML file extracted with sed '162!d' skinlist.html, which contains the text

<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard">.

I want to extract the text Dwarf Red Beard, but that text can change, so I would like to extract whatever appears between title=" and the closing ".

I cannot, for the life of me, figure out how to do this.

Upvotes: 2

Views: 2302

Answers (5)

Endoro

Reputation: 37569

Also with sed (this assumes the title contains only letters and spaces):

sed -n '162 s/.*"\([a-zA-Z ]*\)"./\1/p' skinlist.html

Upvotes: 0

koola

Reputation: 1734

Solution in sed

sed -n '162 s/^.*title="\(.*\)".*$/\1/p' skinlist.html

This selects line 162 of skinlist.html and captures the contents of the title attribute in \1.
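
One possible refinement, sketched on a sample line rather than the real file: because .* is greedy, the capture above runs to the last " on the line, so any attribute that follows title would be included. Using [^"]* instead stops at the first closing quote. The class attribute and the variable names below are made up for illustration.

```shell
# Hypothetical sample line; the trailing class attribute is invented to show the difference.
line='<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard" class="skin">'

# Greedy capture: \(.*\) runs to the last double quote on the line.
greedy=$(printf '%s\n' "$line" | sed -n 's/^.*title="\(.*\)".*$/\1/p')

# Bounded capture: [^"]* cannot cross a double quote, so it ends with the title.
bounded=$(printf '%s\n' "$line" | sed -n 's/^.*title="\([^"]*\)".*$/\1/p')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

On the line from the question, where title is the last attribute, both forms should give the same result.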

Upvotes: 1

abasu

Reputation: 2524

You can pipe the line through another sed, or add expressions to the sed command you already have, like -e 's/.*title="//g' -e 's/">.*$//g'
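
Put together as a single command, that might look like the sketch below. It builds a small temporary stand-in for skinlist.html (its contents and the tmp variable are assumptions for the demo), so line 2 plays the role of line 162.

```shell
# Build a small stand-in for skinlist.html (hypothetical content) so the sketch runs on its own.
tmp=$(mktemp)
printf '%s\n' '<ul>' '<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard">' '</ul>' > "$tmp"

# 1) delete every line except the target line,
# 2) strip everything through title=",
# 3) strip everything from "> to the end of the line.
title=$(sed -e '2!d' -e 's/.*title="//' -e 's/">.*$//' "$tmp")
printf '%s\n' "$title"

rm -f "$tmp"
```

Against the real file, the first expression would be '162!d' and the file argument skinlist.html. The g flag is not needed here, since each pattern can match at most once per line.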

Upvotes: 0

Zombo

Reputation: 1

awk 'NR==162 {print $4}' FS='"' skinlist.html
  • set field separator to "
  • print only line 162
  • print field 4
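
Printing field 4 depends on title being the second attribute on the line. A variation that looks for the field ending in title= and prints the one after it could be sketched as follows; the sample line with an extra class attribute is an assumption for the demo.

```shell
# Hypothetical sample line with an extra attribute before title, so the title is no longer field 4.
line='<a class="skin" href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard">'

# Split on double quotes, find the field that ends in " title=", and print the field after it.
title=$(printf '%s\n' "$line" | awk -F'"' '{
    for (i = 1; i < NF; i++)
        if ($i ~ / title=$/) { print $(i + 1); exit }
}')
printf '%s\n' "$title"
```

To run this against the file, the action block would be prefixed with NR==162 and skinlist.html passed as the input.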

Upvotes: 2

Gordon Davisson

Reputation: 125788

The shell's variable expansion syntax allows you to trim prefixes and suffixes from a string:

line="$(sed '162!d' skinlist.html)"   # extract the relevant line from the file
temp="${line#* title=\"}"    # remove from the beginning through the first match of ' title="'
if [ "$temp" = "$line" ]; then
    echo "title not found in '$line'" >&2
else
    title="${temp%%\"*}"   # remove from the first '"' through the end
fi
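
If this is needed in more than one place, the same expansions could be wrapped in a small function. This is only a sketch: extract_title is a hypothetical name, and the function takes the line as its argument instead of reading the file itself.

```shell
# extract_title is a hypothetical helper name, not part of the answer above.
extract_title() {
    line=$1
    temp=${line#* title=\"}          # drop everything through the first ' title="'
    if [ "$temp" = "$line" ]; then   # nothing was removed, so there is no title attribute
        echo "title not found in '$line'" >&2
        return 1
    fi
    printf '%s\n' "${temp%%\"*}"     # drop from the first '"' through the end
}

title=$(extract_title '<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard">')
printf '%s\n' "$title"
```

With the real file, the call would be something like extract_title "$(sed '162!d' skinlist.html)". The non-zero return status lets the caller detect a missing title.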

Upvotes: 0
