tokhi

Reputation: 21618

Capture string between specific characters

Can someone help me extract the string:

Advice about something

from below:

<TITLE>Advice about something</TITLE>

The expression should be able to capture the string between <TITLE> and </TITLE>. I tried expressions such as [^TITLE<g\/], but couldn't get the right output.

Upvotes: 2

Views: 80

Answers (3)

sawa

Reputation: 168081

If you want a robust solution rather than a temporary hack, use a dedicated parser.

require "cgi"
require "nokogiri"
Nokogiri.parse(CGI.unescapeHTML(
  "<TITLE>Advice about something</TITLE>"
))
.xpath("TITLE").text
# => "Advice about something"

Upvotes: 5

user1679749

Reputation: 9

Depends. If the input is in its entity-encoded form, is it always delimited by semicolons?

tmp = "&lt;TITLE&gt;Advice about something&lt;/TITLE&gt;"
=> "&lt;TITLE&gt;Advice about something&lt;/TITLE&gt;"

tmp.split(';')[2].gsub(/&lt/, "")
=> "Advice about something"

Upvotes: 0

HamZa

Reputation: 14921

Take the left part <TITLE> and the right part </TITLE> and put (.*?) in between:
<TITLE>(.*?)<\/TITLE>

Online demo
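
For reference, here is a minimal Ruby sketch of how that pattern could be applied. It assumes the input contains literal `<TITLE>` tags (not the entity-encoded form) and that the title sits on a single line; `extract_title` is a hypothetical helper name, not something from the question.

```ruby
# Pattern from the answer above: a non-greedy capture between the two tags.
TITLE_PATTERN = /<TITLE>(.*?)<\/TITLE>/

# Hypothetical helper: returns the captured text, or nil when there is no match.
def extract_title(text)
  # String#[] with a Regexp and a group index returns that capture group.
  text[TITLE_PATTERN, 1]
end

puts extract_title("<TITLE>Advice about something</TITLE>")
```

The non-greedy `(.*?)` stops at the first closing tag, so a string with several `<TITLE>` pairs yields only the first title. For anything beyond a simple one-off string, a parser is the safer choice.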

Upvotes: 1
