Reputation: 578
I have a programming challenge, and I'm wondering what the most bug-free way to approach it is.
Basically, I have the following HTML:
<p id="first">
Hello lorem ispum
<a id="link" href="...">Link</a>
linkety link blag
</p>
(The ids are only for proof of concept with getElementById; in reality, I get the DOM references by parsing the page element by element.)
The "Hello lorem ispum" and "linkety link blag" text fragments are orphaned -- I cannot directly access them. I can only access the whole thing with the paragraph tag, or the inside "a" tag.
What I would like is an array holding the paragraph's contents as elements. If the text needs wrapping tags so that I get a reference I can modify with JavaScript, that's OK. E.g., end result:
para[0] = <span>Hello lorem ispum</span>
para[1] = <a id="link" href="...">Link</a>
para[2] = <span>linkety link blag</span>
These should be DOM objects that I can access and change, linked to what's on the page (NOT strings).
Would it just be a bunch of parsing the paragraph tag's innerHTML?
This is all for an open source Chrome plugin that helps people with reading disabilities read text using just the up and down arrow keys. If you have any better ideas of how to approach this problem, please let me know!
Upvotes: 0
Views: 608
Reputation: 12599
var paragraph = document.getElementById('first'),
    list = paragraph.childNodes,
    l = list.length,
    el, container, i, result = [];
for (i = 0; i < l; i++) {
    el = list[i];
    if (el.nodeType === 3) { // text node
        container = document.createElement('span');
        container.className = 'wrapper';
        // we want to remove line breaks from the text
        container.innerText = el.nodeValue.replace(/(\r\n|\n|\r)/gm, "");
        // swap the wrapper into the page so result refers to live nodes
        paragraph.replaceChild(container, el);
        el = container;
    }
    result.push(el);
}
The reason we remove line breaks from the text nodes is that assigning them through innerText converts them into <br> elements inside the <span>, which is probably not what you want.
In your particular case, result will be something like:
[SPAN, A, SPAN]
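One thing to watch with the line-break regex above: replacing a break with an empty string glues together words that were separated only by a newline ("Hello\nlorem" becomes "Hellolorem"). A minimal sketch of an alternative, assuming you would rather collapse every run of whitespace into one space and trim the ends; `collapseWhitespace` is a made-up name, not part of the answer's code:

```javascript
// Hypothetical helper: collapse each run of whitespace (including line
// breaks) into a single space, then drop a leading or trailing space.
function collapseWhitespace(text) {
  return text.replace(/\s+/g, ' ').replace(/^ | $/g, '');
}
```

You could pass `el.nodeValue` through this before assigning it to the span instead of stripping the line breaks outright.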
Upvotes: 1
Reputation: 97717
Try this. It creates a span containing the text node's content and replaces the text node with that span:
var p = document.getElementById('first');
var elements = [];
for (var i = 0; i < p.childNodes.length; i++) {
    var child = p.childNodes[i];
    if (child.nodeType == 3) { // text node
        if (!/^\s+$/.test(child.nodeValue)) { // skip whitespace-only nodes
            var span = document.createElement('span');
            // copy as text, not markup, so characters like "<" stay literal
            span.appendChild(document.createTextNode(child.nodeValue));
            p.replaceChild(span, child);
            elements.push(span);
        }
    }
    else if (child.nodeType == 1) { // element node
        elements.push(child);
    }
}
http://jsfiddle.net/mowglisanu/t6UaJ/
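If you need this for more than one paragraph, the same idea can be packaged as a function. This is only a sketch: `wrapTextNodes` and its `doc` parameter are names I made up (in a browser you would pass the global `document`). It copies the live childNodes list before changing anything, and it moves the original text node into the span rather than copying its value.

```javascript
// Hypothetical helper: wraps each non-blank direct text node of `parent`
// in a <span> and returns the spans and element children in document order.
// `doc` is the owning document (assumption: pass `document` in a browser).
function wrapTextNodes(parent, doc) {
  var TEXT_NODE = 3, ELEMENT_NODE = 1;
  var children = [], result = [], i, child, span;

  // childNodes is live, so take a snapshot before modifying the tree.
  for (i = 0; i < parent.childNodes.length; i++) {
    children.push(parent.childNodes[i]);
  }

  for (i = 0; i < children.length; i++) {
    child = children[i];
    if (child.nodeType === TEXT_NODE) {
      if (/^\s*$/.test(child.nodeValue)) continue; // skip blank text nodes
      span = doc.createElement('span');
      parent.replaceChild(span, child); // span takes the text node's place
      span.appendChild(child);          // text node now lives inside the span
      result.push(span);
    } else if (child.nodeType === ELEMENT_NODE) {
      result.push(child);
    }
  }
  return result;
}
```

For the markup in the question, `wrapTextNodes(document.getElementById('first'), document)` should return three nodes: a span, the link, and another span, all attached to the page.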
Upvotes: 1
Reputation: 55750
You can iterate over the paragraph's childNodes and swap each text node for a span:
var para = document.getElementById('first');
var arr = [];
for (var i = 0; i < para.childNodes.length; i++) {
    var elem = para.childNodes[i];
    if (elem.nodeType === 3) { // text node
        var newElem = document.createElement('span');
        newElem.className = 'a';
        // insert as text, not markup, so characters like "<" stay literal
        newElem.appendChild(document.createTextNode(trim(elem.nodeValue)));
        // swap the span in for the text node it wraps
        para.replaceChild(newElem, elem);
        arr.push(newElem);
    }
    else {
        arr.push(elem);
    }
}
console.log(arr);

// strip leading and trailing whitespace
function trim(str) {
    return str.replace(/^\s+|\s+$/g, "");
}
Upvotes: 1
Reputation: 708036
You can grab the text from the text nodes that aren't inside other elements by walking the child nodes of the <p> tag and looking at the nodeType to see which nodes are text nodes:
// named "para" rather than "top": at global scope, "top" refers to window.top
var para = document.getElementById("first");
var node = para.firstChild;
while (node) {
    // get text from text nodes that aren't contained in elements
    if (node.nodeType === 3) {
        // node.nodeValue is the text in the text node
    } else if (node.nodeType === 1) {
        // node is an element here so you can get innerHTML or textContent or whatever you want
    }
    node = node.nextSibling;
}
Working demo: http://jsfiddle.net/jfriend00/YvBpw/
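To make the walk above concrete, here is a rough sketch that fills in the text-node branch and collects the trimmed text of each direct text node. `collectDirectText` is a name invented for this example, and skipping blank nodes is my assumption about what you would want:

```javascript
// Hypothetical helper: returns the trimmed text of every non-blank text
// node that is a direct child of `parent`, in document order.
function collectDirectText(parent) {
  var TEXT_NODE = 3;
  var texts = [];
  for (var node = parent.firstChild; node; node = node.nextSibling) {
    if (node.nodeType !== TEXT_NODE) continue; // ignore elements, comments, etc.
    var text = node.nodeValue.replace(/^\s+|\s+$/g, '');
    if (text) texts.push(text);
  }
  return texts;
}
```

For the paragraph in the question, `collectDirectText(document.getElementById("first"))` should give the two text fragments and leave out the link's text.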
If you just want the plain text from the whole <p> tag (including all elements) and want it to work cross-browser, you can use this:
var t = document.getElementById("first");
var text = t.textContent || t.innerText;
This will be an HTML-stripped text conversion of everything in the <p> tag.
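One edge case with the `||` form: when the element is empty, textContent is "" (falsy), so the expression falls through to innerText, which is undefined in browsers that don't implement it. A small sketch that checks the property type instead; `getPlainText` is a hypothetical name, not an existing API:

```javascript
// Hypothetical helper: prefer textContent, fall back to innerText, and
// always return a string, even for an empty element.
function getPlainText(el) {
  if (typeof el.textContent === 'string') return el.textContent;
  if (typeof el.innerText === 'string') return el.innerText;
  return '';
}
```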
Upvotes: 0