Reputation: 27
What I'm looking for
I have code like this:
<div this-html="text goes here"></div>
After running a regex replacement, I want the value of the "this-html" attribute to become the text between the opening and closing tag, like so:
<div>text goes here</div>
The element can contain other attributes and doesn't have to be a div; it can be any other type of element, as long as it uses a closing tag (which doesn't have to be on the same line). It's also possible that the input has text between the tags, like so: <div this-html="text goes here">dummy text</div>, but that text can be ignored and should be overwritten with the value of the "this-html" attribute.
What I have
I can't use jQuery or turn the string into a JavaScript object, as it may contain PHP (which will get crippled if you turn it back into a string again). This script is used during a 'publish to html' process of an application, hence it can contain PHP. And so, I'm trying to solve it using regular expressions.
So basically, all I have is JavaScript, and the HTML I need to work with is just a string; there's no DOM to work with.
Now, I have a regular expression that does this for me, but it doesn't work when you have multiple matches on the same line or when I have another attribute after "this-html".
This is the regex I'm using:
/(<\s*[^<]+?)this-html=['"]{1}(.+)['"]{1}([^>]*>)[\w\W]*?(<\/.+>)/gmi
And I group it back together with $1$3$2$4.
Now, let's say I have the following input:
<div this-html="text goes here!" class="something">test</div><div this-html="another test">Option is visible on preview/publish</div>
Then my regex pattern will mess this up and I end up with something like this:
<div >text goes here!" class="something">test</div><div this-html="another test</div>
I'm not a regex guru, and I get the feeling this regex could be a whole lot simpler, but I'm stuck here.
Any ideas?
Upvotes: 0
Views: 77
Reputation: 626920
When a DOM is available, this is the correct way of doing what you want:
function convert(html) {
  // Parse the string into a detached container element
  var div = document.createElement('div');
  div.innerHTML = html;
  div.querySelectorAll('*').forEach(el => {
    var h = el.getAttribute('this-html');
    if (h) {
      // Replace the element's content with the attribute value, then drop the attribute
      el.innerHTML = h;
      el.removeAttribute('this-html');
    }
  });
  return div.innerHTML;
}
var html = '<div this-html="text goes here!" class="something">test</div><div this-html="another test">Option is visible on preview/publish</div>';
console.log( convert(html) );
However, as your environment does not allow you to use the DOM, you might resort to a regex like
text.replace(/(<\s*(\w+)[^<]*?)\s+this-html=['"]([^"']*)['"]([^>]*?)\s*>[\w\W]*?(<\/\2>)/gi, '$1$4>$3$5')
See the regex demo. NOTE: once it is possible to use the DOM, please switch to the DOM-based solution above instead of this workaround.
Details
(<\s*(\w+)[^<]*?) - Group 1 ($1 value): <, zero or more whitespaces, Group 2 ($2 value): any one or more word chars, then any zero or more chars other than < as few as possible
\s+ - one or more whitespaces
this-html= - literal text
['"] - a " or '
([^"']*) - Group 3 ($3 value): zero or more chars other than " and '
['"] - a " or '
([^>]*?) - Group 4 ($4 value): any zero or more chars other than > as few as possible
\s* - zero or more whitespaces
> - a > char
[\w\W]*? - any zero or more chars, as few as possible
(<\/\2>) - Group 5 ($5 value): </, same value as in Group 2, >.
See the JavaScript demo:
var text = '<div this-html="text goes here!" class="something">test</div><div this-html="another test">Option is visible on preview/publish</div>';
console.log( text.replace(/(<\s*(\w+)[^<]*?)\s+this-html=['"]([^"']*)['"]([^>]*?)\s*>[\w\W]*?(<\/\2>)/gi, '$1$4>$3$5') );
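To check the pattern against the other cases mentioned in the question (a non-div element, attributes before and after "this-html", single quotes, and a closing tag on a different line), a small sketch like the one below may help. The function name inlineThisHtml and the sample strings are made up for illustration; only the pattern and the replacement string come from the answer above.

```javascript
// Hypothetical wrapper around the regex from the answer above.
function inlineThisHtml(text) {
  var pattern = /(<\s*(\w+)[^<]*?)\s+this-html=['"]([^"']*)['"]([^>]*?)\s*>[\w\W]*?(<\/\2>)/gi;
  // $1 = tag start + attributes before this-html, $4 = attributes after it,
  // $3 = attribute value (becomes the content), $5 = closing tag
  return text.replace(pattern, '$1$4>$3$5');
}

// Sample inputs (assumptions, not taken from the original post)
var basic = '<div this-html="text goes here"></div>';
var multiLine = '<p class="a" this-html=\'single quoted\' id="x">dummy\ntext\n</p>';
var sameLine = '<span this-html="one">x</span><span this-html="two">y</span>';

console.log(inlineThisHtml(basic));     // <div>text goes here</div>
console.log(inlineThisHtml(multiLine)); // <p class="a" id="x">single quoted</p>
console.log(inlineThisHtml(sameLine));  // <span>one</span><span>two</span>
```

One caveat that follows from the lazy [\w\W]*? part: the match ends at the first closing tag with the same name, so nested elements of the same type are not handled. For example, '<div this-html="x"><div>inner</div></div>' comes out as '<div>x</div></div>'.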
Upvotes: 1
Reputation: 436
I don't know what the PHP code contained in the string will look like, but would something like this work? :)
var regex = /<\s*(\w+)([^>]*)\s*this-html=\"([^"]*)\"([^>]*)>[^<]*<\s*\/\s*\w+\s*>/gi;
var stringTest = '<div this-html="text goes here!" class="something">test</div><div this-html="another test">Option is visible on preview/publish</div>';
var result = stringTest.replace(regex,'<$1$2$4>$3</$1>');
alert(result);
Upvotes: 0