ilrein

Reputation: 3923

Finding a substring if a condition is met

I have access to a large amount of HTML inside a single string:

const { body_html } = this.props.product_page;

I am attempting to update this HTML using only string parsing. Specifically, I want to find the first closing div tag that appears after a specific substring:

product.description.

The challenge is the dynamic nature of product_page: there will be an unknown number of characters between the end of the substring product.description and the first closing </div>.

How can I inject <div>Hello, world!</div> after the first closing div -- after finding the product description variable?

EDIT: I know it's poor practice to modify HTML in such a fashion, but due to technical constraints, these are the conditions I have to satisfy. Also, this is not pure HTML but Liquid code (a Ruby-based template language). Finally, I never asked for regex specifically. Can't indexOf with substrings be enough (or is that technically the same thing)?

Upvotes: 0

Views: 286

Answers (3)

T.J. Crowder

Reputation: 1075755

Obligatory link. By far, the best way to do this is to parse the HTML properly: With an HTML parser. There's one built into the browser, after all. If you try to do this with simplistic string processing, the odds are it will bite you.

Can't indexOf with substrings be enough (or is that technically the same thing)?

Not quite. Officially, the end tag for a div could be </div> or </div > (where that space could be any amount of whitespace, including newlines, tabs, etc.). In practice, browsers tolerate whitespace between the / and div as well.
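To make the difference concrete, here is a minimal sketch comparing a plain indexOf with a whitespace-tolerant pattern. The sample strings and variable names are made up for illustration, and the pattern only covers whitespace before the closing ">":

```javascript
// Hypothetical sample inputs: the same end tag written three ways.
var samples = ["a</div>b", "a</div >b", "a</div\n\t>b"];

// Allows optional whitespace before the closing ">".
var endTag = /<\/div\s*>/;

var results = samples.map(function (s) {
  return {
    plain: s.indexOf("</div>"), // -1 when the tag contains whitespace
    regex: s.search(endTag)     // index of the end tag, or -1 if absent
  };
});

console.log(results);
```

Only the first sample is found by the literal search; the pattern finds the end tag in all three.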

So you'll want a regular expression to find the end tag. Something like:

var str = "testing product.description }}\n</div\n\t >";
var match = /(product\.description[\s\S]*?)<\/\s*div\s*>/.exec(str);
console.log("Original string: " + str);
if (match) {
  var index = match.index + match[1].length;
  console.log("It's at index " + index);
  str = str.substring(0, index) +
        "<div>Hello, world!</div>" +
        str.substring(index);
  console.log("New string: " + str);
} else {
  console.log("Not found");
}

That regex allows for whitespace in the closing </div> tag and gives you the length of the part of the match prior to it, so you can insert the string.
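If the new markup should land after the closing tag rather than before it, one possible variant is to let replace do the splicing. This is only a sketch: insertAfterClosingDiv and its parameters are hypothetical names, and it assumes the first </div> following the marker is the one you want.

```javascript
// Hypothetical helper: inserts `html` right after the first </div> end tag
// that follows `marker`. Returns the input unchanged if nothing matches.
function insertAfterClosingDiv(source, marker, html) {
  // Escape regex metacharacters in the marker so it is matched literally.
  var escaped = marker.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  var pattern = new RegExp(escaped + "[\\s\\S]*?<\\/\\s*div\\s*>");
  // A function replacer keeps "$" in `html` from being read as a
  // special replacement pattern.
  return source.replace(pattern, function (match) {
    return match + html;
  });
}

var input = "{{ product.description }}\n</div >\n<p>after</p>";
var output = insertAfterClosingDiv(input, "product.description", "<div>Hello, world!</div>");
console.log(output);
```

Without the g flag, replace touches only the first match, which is the behaviour the question asks for.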

One slightly tricky bit of that is the [\s\S]*? part, which is basically .*? (lazily match any number of any characters) but it includes newlines, which . doesn't. ([\s\S] means "any whitespace or non-whitespace character.")
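A quick way to check the newline behaviour yourself, as a small sketch with a made-up test string. The third pattern uses the s (dotAll) flag from ES2018, which is an alternative to [\s\S] where it is supported:

```javascript
var text = "product.description }}\n</div>";

// "." does not match "\n", so this pattern cannot reach the end tag.
var dotMatch = /product\.description.*?<\/div>/.test(text);

// [\s\S] matches any character, including "\n".
var classMatch = /product\.description[\s\S]*?<\/div>/.test(text);

// With the "s" flag, "." matches "\n" as well.
var dotAllMatch = /product\.description.*?<\/div>/s.test(text);

console.log(dotMatch, classMatch, dotAllMatch);
```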

Upvotes: 1

Strikers

Reputation: 4776

Here is a small idea for you to work with.

Let's say you have the following HTML string:

var myStr = "<div><span>Hello<span><div>SearchString<div>Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.</div></div></div>";

and your search string is

var searchString = "SearchString";

Now first find the index of this string using myStr.indexOf(), then search from that position for the nearest occurrence of '</div>':

var searchIndex = myStr.indexOf(searchString);
var closingDivIndex = myStr.indexOf('</div>', searchIndex); // index within myStr, not within a substring

Now you have the index of the closing tag. Add 6 (the length of '</div>') to get the position just after it, insert your string there, and you are good to go.
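Put together, the steps might look like the sketch below. The function name insertAfterDiv is hypothetical, and it assumes the end tag is written exactly as "</div>" with no whitespace inside it:

```javascript
// Hypothetical helper built on indexOf only; assumes the end tag is written
// exactly as "</div>" (no whitespace inside it).
function insertAfterDiv(source, searchString, html) {
  var closeTag = "</div>";
  var searchIndex = source.indexOf(searchString);
  if (searchIndex === -1) return source; // search string not present
  var closeIndex = source.indexOf(closeTag, searchIndex + searchString.length);
  if (closeIndex === -1) return source;  // no closing div after it
  var insertAt = closeIndex + closeTag.length; // position just past "</div>"
  return source.slice(0, insertAt) + html + source.slice(insertAt);
}

var demo = "<div>SearchString<div>text</div></div>";
var demoResult = insertAfterDiv(demo, "SearchString", "<div>Hello, world!</div>");
console.log(demoResult);
```

Returning the input unchanged when either search fails avoids splicing at a bogus position, since indexOf reports a miss as -1.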

here is a jsfiddle for you

Upvotes: 0

Akhil

Reputation: 2602

First of all, this is not good practice. You should definitely try an HTML parser first.

But just to answer your question, here is some sample code:

var mySearchStr = "<div> Test String 1 </div><div> myString is this </div><div> New string should be before this </div>";

var searchStrIndex = mySearchStr.indexOf("myString");

var closingDivIndex = mySearchStr.indexOf("</div>", searchStrIndex + 1); // First closing div after the first occurrence of the search string

var firstPart = mySearchStr.substring(0, closingDivIndex + 6);  // 6 is the length of </div>

var secondPart = mySearchStr.substring(closingDivIndex + 6);

var finalString = firstPart + "<div> My New content </div>" + secondPart;

alert(finalString);

There may be better ways out there using regex. But I am not an expert there.

Plunker

Upvotes: 0
