Yury Pogrebnyak

Reputation: 4173

DOMstring parser

I have a DOMString, the text of a web page that I get from the server using XMLHttpRequest. I need to extract a substring from it that lies between specific tags. Is there an easy way to do this? Methods such as substring() or slice() won't work on their own in my case, because the content of the page is dynamic, so I can't specify the start and end positions of the substring in advance (I only know that it's surrounded by <tag> and </tag>).

Upvotes: 0

Views: 4940

Answers (4)

maerics

Reputation: 156384

A DOMString is just implemented as a string in most (all?) JavaScript browser environments so you can use any parsing technique you like, including regular expressions, DOMParser, and the HTML parser provided by libraries such as jQuery. For example:

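function extractText(domString) {
  // [\s\S] also matches newlines, which "." does not.
  var m = ('' + domString).match(/<tag>([\s\S]*?)<\/tag>/i);
  return (m) ? m[1] : null; // m[1] is the captured text between the tags.
}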

Of course, this is a terrible idea; you should really use a DOM parser, for example, with jQuery:

$('<div>').html(htmlString).find('tag').html();

[Edit] To clarify the above jQuery example, it's the equivalent of doing something like below:

function extractText2(tagName, htmlString) {
  var div = document.createElement('div'); // Build a DOM element.
  div.innerHTML = htmlString; // Set its contents to the HTML string.
  var el = div.getElementsByTagName(tagName); // Find the target tag.
  return (el.length > 0) ? el[0].textContent : null; // Return its contents.
}
extractText2('tag', '<tag>Foo</tag>'); // => "Foo"
extractText2('x', '<x><y>Bar</y></x>'); // => "Bar"
extractText2('y', '<x><y>Bar</y></x>'); // => "Bar"

This solution is better than the regex solution because it handles HTML syntax nuances on which a regex would fail. Of course, it likely needs some cross-browser testing, hence the recommendation to use a library like jQuery (or Prototype, ExtJS, etc.).
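
For reference, here is a minimal sketch of the DOMParser route mentioned at the top of this answer. It assumes a browser environment where DOMParser is available, and the function name extractWithDomParser is hypothetical, used only for illustration:

```javascript
// Hypothetical helper: parse the string as an HTML document and read the
// first element with the given tag name.
// Assumes a browser environment; DOMParser is not a built-in of plain Node.js.
function extractWithDomParser(htmlString, tagName) {
  var doc = new DOMParser().parseFromString(String(htmlString), 'text/html');
  var el = doc.querySelector(tagName); // First matching element, or null.
  return el ? el.textContent : null;   // Text content, or null if not found.
}
```

In a browser, extractWithDomParser('<x><y>Bar</y></x>', 'y') should give "Bar", the same as extractText2 above.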

Upvotes: 1

Davsket

Reputation: 1308

Like @Gus's answer, but improved for the case where the tags contain only text and are repeated:

"<tag>asd</tag>".match(/<tag>([^<]+)<\/tag>/)[1]; // => "asd"

Upvotes: 0

Gus

Reputation: 6871

Assuming the surrounding tag is unique in the string...

domString.match(/.*<tag>(.*)<\/tag>.*/)[1]

or

/.*<tag>(.*)<\/tag>.*/.exec(domString)[1]

That seems like it should do the trick.
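
If the tag is not unique, the greedy (.*) above spans from the first opening tag to the last closing tag. Here is a minimal sketch that collects each occurrence separately, assuming the tags are not nested; extractAll is a hypothetical name:

```javascript
// Hypothetical helper: collect the text of every <tag>...</tag> pair.
// Assumes the tags are not nested inside each other.
function extractAll(domString) {
  var re = /<tag>([\s\S]*?)<\/tag>/g; // Lazy match; "g" lets exec() find every pair.
  var results = [];
  var m;
  while ((m = re.exec(domString)) !== null) {
    results.push(m[1]); // m[1] is the text captured between the tags.
  }
  return results;
}

extractAll('<tag>a</tag> and <tag>b</tag>'); // => ["a", "b"]
```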

Upvotes: 0

Jon

Reputation: 230

yourString.substring(yourString.indexOf('<tag>') + 5, yourString.indexOf('</tag>'));

This should work, assuming you know the name of the surrounding tags.
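
One caveat: indexOf returns -1 when a tag is missing, and substring would then silently return the wrong slice. Here is a minimal sketch with those checks added, assuming the opening tag has no attributes; between and its parameters are hypothetical names, not part of any library:

```javascript
// Hypothetical helper: return the text between the first <tagName> and the
// next </tagName>, or null if either tag is missing.
// Assumes the opening tag is written without attributes.
function between(str, tagName) {
  var open = '<' + tagName + '>';
  var close = '</' + tagName + '>';
  var start = str.indexOf(open);
  if (start === -1) return null;
  start += open.length;                // Skip past the opening tag itself.
  var end = str.indexOf(close, start); // Search only after the opening tag.
  return end === -1 ? null : str.substring(start, end);
}

between('<p>Hi</p><tag>Foo</tag>', 'tag'); // => "Foo"
between('no tags here', 'tag');            // => null
```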

Upvotes: 2
