Reputation: 15
The code below is meant to turn words into links, but it only works with English words. I would like it to work with Arabic words too.
The code
<script>
// <![CDATA[
document.addEventListener("DOMContentLoaded", function(){
var links = {
"مغامرات": "https://www.example.com/search/label/%D9%85%D8%BA%D8%A7%D9%85%D8%B1%D8%A7%D8%AA",
"East": "https://www.example.com/search/label/%D8%AE%D9%8A%D8%A7%D9%84",
}
var bodi = document.querySelectorAll("body *:not(script)");
for(var x=0; x<bodi.length; x++){
var html = bodi[x].innerHTML;
for(var i in links){
var re = new RegExp("([\\s| ]"+i+"(?:(?=[,<.\\s])))", "gi");
var matches = html.match(re);
if(matches){
matches = html.match(re)[0].trim();
html = html.replace(re, function(a){
return ' <a href="'+links[i]+'">'+a.match(/[A-zÀ-ú]+/)[0].trim()+'</a>';
});
}
}
bodi[x].innerHTML = html;
}
});
// ]]>
</script>
Upvotes: 0
Views: 259
Reputation: 1101
Let me replace your approach with a simpler one that is easier to follow.
In this example, a small function detects the words with a dynamically built RegEx and wraps each match in an anchor (a) tag that points to its link:
function linkWords(elem,words,links) {
// Use innerHTML so the anchor tags can be inserted with a plain string replace
elem.innerHTML=elem.innerHTML.replace(
// Build a RegEx (g: global, i: case insensitive) by joining the words, each in its own capture group
// (!) The capture groups are passed to the callback as arguments, in the same order as the words
RegExp('('+words.join(')|(')+')','gi'),
// The callback receives its arguments like this:
// [match, parenthesized capture groups..., offset, string]
function(){
// So we skip the first argument and the last two
for (var i=1;i<arguments.length-2;i++)
// The group that is not undefined belongs to the word that matched
if (arguments[i])
// Return the matched text wrapped in an anchor tag, using the link at the same index
return '<a href="'+links[i-1]+'">'+arguments[0]+'</a>';
}
);
}
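// Wrap the call in a function so it runs when the event fires, not while the listener is being registered
document.addEventListener("DOMContentLoaded", function(){
linkWords(document.body,
["كلمة","word"],
["https://www.example.com/search/%D9%83%D9%84%D9%85%D8%A9","https://www.example.com/search/word"]
);
});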
<div>Hi! this is a word and it have to be linked.</div>
<div dir="rtl">السلام علیکم! هذه كلمة ويجب ربطها.</div>
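To make the callback arguments described in the comments above more concrete, here is a minimal sketch that records what the replacer receives for each match. It works on a plain string, so no DOM is needed. The helper name traceReplace and the sample sentence are assumptions made for illustration; they are not part of the answer's code.

```javascript
// Hypothetical helper: records what the replacer callback receives for each match.
function traceReplace(text, words) {
  var calls = [];
  // Same pattern shape as linkWords: one capture group per word
  var re = new RegExp('(' + words.join(')|(') + ')', 'gi');
  text.replace(re, function () {
    // args = [match, group1, ..., groupN, offset, wholeString]
    var args = Array.prototype.slice.call(arguments);
    var groups = args.slice(1, args.length - 2);
    // The position of the only defined group tells us which word matched
    var wordIndex = groups.findIndex(function (g) { return g !== undefined; });
    calls.push({ match: args[0], wordIndex: wordIndex, offset: args[args.length - 2] });
    return args[0]; // leave the text unchanged, we only observe
  });
  return calls;
}

var calls = traceReplace('A word, then كلمة, then WORD.', ['كلمة', 'word']);
console.log(calls);
```

Each recorded wordIndex is the index into the words array, which is why linkWords can use links[i-1] to pick the matching link.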
To understand what happens above, read up on the features the code relies on: String.prototype.replace with a replacer function, the RegExp constructor, and the arguments object.
Important notes: this function is only a first step for understanding the idea. It uses an experimental method, not a standard (trustworthy) one, because of these possible problems:
- innerHTML, used without special care, may miss HTML-encoded characters (entities).
- Character variants: the letter Ya can be written ي, ى or ی, the digit 4 can be written 4, ٤ or ۴, and so on.
- Working on innerHTML instead of Node.textContent means the pattern can also match inside attributes, or inside the text of non-readable tags like <style> or <script>.
- The inputs need a prefix and postfix check to avoid mistakes. For example, overlapping inputs such as ['win','window'] can make a mess, and already linked words are not detected.
Also, this kind of processing should usually be done server-side, to avoid many possible client-side mistakes.
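Two of the problems listed above, overlapping inputs such as ['win','window'] and the different spellings of Ya, can be partly handled while the pattern is built. The following is only a sketch under assumptions: the helper names escapeRegExp, expandYa and buildMatcher are made up for illustration, the example links are placeholders, and digit variants are not covered.

```javascript
// Escape RegExp metacharacters so an input word is matched literally.
function escapeRegExp(word) {
  return word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: let the three Ya spellings
// (U+064A, U+0649, U+06CC) match one another.
function expandYa(pattern) {
  return pattern.replace(/[\u064A\u0649\u06CC]/g, '[\u064A\u0649\u06CC]');
}

// Hypothetical helper: one capture group per word, longest word first,
// with the links reordered so they stay aligned with the groups.
function buildMatcher(words, links) {
  var order = words
    .map(function (w, i) { return i; })
    .sort(function (a, b) { return words[b].length - words[a].length; });
  var source = order
    .map(function (i) { return '(' + expandYa(escapeRegExp(words[i])) + ')'; })
    .join('|');
  return {
    regex: new RegExp(source, 'gi'),
    links: order.map(function (i) { return links[i]; })
  };
}

// Usage: 'window' is tried before 'win', so the longer word is linked as a whole.
var m = buildMatcher(['win', 'window'], ['/win', '/window']);
var linked = 'window win'.replace(m.regex, function () {
  for (var i = 1; i < arguments.length - 2; i++)
    if (arguments[i])
      return '<a href="' + m.links[i - 1] + '">' + arguments[0] + '</a>';
  return arguments[0];
});
console.log(linked);
```

Sorting longest-first matters because alternation in a RegEx tries the groups from left to right; with the original order, 'win' would match the first three letters of 'window'.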
So, if you want to keep doing this on the client side (front end):
To avoid linking words that are already linked, while still keeping things simple, we can improve the RegEx by adding a negative lookahead to the pattern.
Live example that shows how it works:
https://regexr.com/6c00r
Visualized pattern:
https://jex.im/regulex/#!flags=ig&re=(%3F!%5B%5E%3E%5D*%3C%5C%2Fa%3E)(%3F%3A(word)%7C(%D9%83%D9%84%D9%85%D8%A9))
function linkWords(elem, words, links) {
elem.innerHTML = elem.innerHTML.replace(
// Improved RegEx: a negative lookahead skips words that are already inside an <a> tag
RegExp('(?![^>]*</a>)(?:(' + words.join(')|(') + '))', 'gi'),
function() {
for (var i = 1; i < arguments.length - 2; i++)
if (arguments[i])
return '<a href="' + links[i - 1] + '">' + arguments[0] + '</a>';
}
);
}
// Wrap the call in a function so it runs when the event fires, not while the listener is being registered
document.addEventListener("DOMContentLoaded", function() {
linkWords(
document.querySelector('.me'), // <---- The first argument selects the target element
["كلمة", "word"], // Array of the target words
["https://www.example.com/search/%D9%83%D9%84%D9%85%D8%A9", "https://www.example.com/search/word"] // Array of the links for those words
);
});
.me {background: #efefef;}
a {text-decoration: underline;}
<div class="me">
<div>Hi! this is a word and it have to be linked.</div>
<div dir="rtl">السلام علیکم! هذه كلمة ويجب ربطها.</div>
<div>Not this <a>word</a> that already is inside an anchor tag. But still this WORD.</div>
</div>
<br>
<div>Another element: word word word</div>
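The effect of the negative lookahead can also be checked on a plain string, without a DOM. This sketch reuses the same pattern shape as linkWords above; the function name linkWordsInHtml, the sample input and the link /w are assumptions made for illustration.

```javascript
// Same pattern shape as linkWords above, applied to an HTML string.
function linkWordsInHtml(html, words, links) {
  var re = new RegExp('(?![^>]*</a>)(?:(' + words.join(')|(') + '))', 'gi');
  return html.replace(re, function () {
    for (var i = 1; i < arguments.length - 2; i++)
      if (arguments[i])
        return '<a href="' + links[i - 1] + '">' + arguments[0] + '</a>';
    return arguments[0]; // fallback: leave the text unchanged
  });
}

var input = 'Not this <a>word</a> but this WORD.';
var output = linkWordsInHtml(input, ['word'], ['/w']);
console.log(output);
// → Not this <a>word</a> but this <a href="/w">WORD</a>.

// Limitation: the lookahead only looks as far as the next ">",
// so a word nested deeper inside an anchor is still linked.
var nested = linkWordsInHtml('<a><b>word</b></a>', ['word'], ['/w']);
console.log(nested);
```

The second call shows why this remains a simplistic fix: the word inside <b> is followed by </b>, not directly by </a>, so the lookahead does not recognise it as already linked.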
Explaining the RegEx:
- (?![^>]*<\/a>): the negative lookahead (?!...) checks that the text after the current position is NOT:
  - [^>]* (any number of characters other than ">", so the check stops at the end of the next tag)
  - followed by </a>
- (?:...): a non-capturing group (so the indexes of the other groups do not change)
- (word) is group number 1 and (كلمة) is group number 2; in general: (word1)|(word2)|(word3)|...
- /.../gi: the flags:
  - g (global): keep searching for more matches
  - i (ignore case): be case insensitive (A=a)
Upvotes: 1