Reputation: 45
I have five HTML files and a search form that I would like to use to search for text in those files.
<form>
<input type ='text' />
<input type ='submit' />
</form>
My idea is to use XMLHttpRequest to get the files
var xhr1 = new XMLHttpRequest();
xhr1.open("GET", "file1.html", false); // false = synchronous request
xhr1.send();
var html1 = xhr1.responseText; // keep each file's text in its own variable
var xhr2 = new XMLHttpRequest();
xhr2.open("GET", "file2.html", false);
xhr2.send();
var html2 = xhr2.responseText;
...
then search for text in these files, but I don't know how to do the search in JavaScript.
How can I search the files after getting them with XMLHttpRequest? Or is there another way to do the search using JavaScript?
Upvotes: 4
Views: 17440
Reputation: 1750
First, change:
<input type ='text' />
To:
<input id= 'text' type='text' />
Then the code below builds an array called 'files' made up of objects. The 'position' property of each object holds the position of the search text within that file's contents, -1 if the text is not found, or -2 if the file did not load.
// Read the search text from the input; run this when the form is submitted.
var text = document.getElementById( 'text' ).value;
var loadCount = 0;
var files = [
    { filename: "file1.html" },
    { filename: "file2.html" },
    { filename: "file3.html" },
    { filename: "file4.html" },
    { filename: "file5.html" }
];
function search( item, index ) {
    var xhr = new XMLHttpRequest();
    // Called once per file, whether or not it loaded.
    function done( position ) {
        files[ index ][ 'position' ] = position;
        loadCount = loadCount + 1;
        if ( loadCount == files.length ) {
            // all files are processed; do whatever you want here
        }
    }
    xhr.onload = function () {
        if ( xhr.status == 200 ) {
            files[ index ][ 'contents' ] = xhr.responseText;
            done( xhr.responseText.indexOf( text ) );
        } else {
            done( -2 ); // e.g. a 404: the file did not load
        }
    };
    xhr.onerror = function () { done( -2 ); };
    // Handlers are attached before sending; true = asynchronous request.
    xhr.open( "GET", item[ 'filename' ], true );
    xhr.send();
}
files.forEach( search );
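Note that 'position' only records the first match in each file. If you need every occurrence, a minimal sketch could look like the following; findAllPositions is a hypothetical helper name, not part of the code above, and it counts non-overlapping matches only.

```javascript
// Hypothetical helper: returns the index of every (non-overlapping)
// occurrence of `query` in `contents`, or an empty array if there is none.
function findAllPositions(contents, query) {
    var positions = [];
    if (query === "") {
        return positions; // an empty query would otherwise match everywhere
    }
    var index = contents.indexOf(query);
    while (index !== -1) {
        positions.push(index);
        // continue searching just past the current match
        index = contents.indexOf(query, index + query.length);
    }
    return positions;
}

console.log(findAllPositions("<p>cat and cat</p>", "cat")); // positions 3 and 11
```

You could call it with files[ index ][ 'contents' ] once a file has loaded.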
Upvotes: -1
Reputation: 23382
I'd use the DOMParser to make sure we're doing some "smart" searching. Let's say you are looking for texts about the word "viewport"; you don't want every HTML file that has a <meta> tag named "viewport" to come back as a valid result, do you?
Step one is parsing the string to a Document instance:
const parseHTMLString = (() => {
const parser = new DOMParser();
return str => parser.parseFromString(str, "text/html");
})();
Put a valid HTML string in here, and you'll get a document in return that behaves just like window.document! This means we can do all kinds of cool stuff like using querySelector and properties like innerText.
The next step is to define what we want to search. Here's an example that joins a document's title and body text:
const getSearchStringForDoc = doc => {
return [ doc.title, doc.body.innerText ]
.map(str => str.toLowerCase().trim())
.join(" ");
};
Pass your parsed document to this function, and you'll get a plain string in return that features just content, without attributes, tag names and metadata.
Now it's a matter of defining the right search method. It could be a RegExp-based match, or just a (slower) split & includes:
const stringMatchesQuery = (str, query) => {
    return query
        .toLowerCase()
        .split(/\W+/)
        .filter(Boolean) // drop empty pieces; str.includes("") is always true
        .some(q => str.includes(q));
};
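For comparison, the RegExp-based match mentioned above might look like this sketch. The name stringMatchesQueryRegExp is made up for illustration, and unlike the includes version it only matches whole words, which is a design choice you may or may not want.

```javascript
// Hypothetical RegExp-based variant of stringMatchesQuery (an assumption,
// not part of the original answer): true if any word of `query` appears
// in `str` as a whole word, ignoring case.
const stringMatchesQueryRegExp = (str, query) => {
    // After splitting on \W+ every word contains only [A-Za-z0-9_],
    // so no RegExp escaping is needed here.
    const words = query.split(/\W+/).filter(Boolean);
    if (words.length === 0) {
        return false; // nothing to search for
    }
    const pattern = new RegExp("\\b(?:" + words.join("|") + ")\\b", "i");
    return pattern.test(str);
};

console.log(stringMatchesQueryRegExp("Some text about an interesting thing.", "Interesting")); // true
console.log(stringMatchesQueryRegExp("Some text about an interesting thing.", "interest"));    // false
```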
Chain those functions together and you get a conversion like:
String -> Document -> String -> Boolean
If you ever want to include more information in the search content, you just update the getSearchStringForDoc function using the standardized API.
A running example (that's a bit messy and could do with some refactoring, but hopefully gets the point across):
const htmlString = (
`<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>The title</title>
</head>
<body>
Some text about an interesting thing.
</body>
</html>`);
const parseHTMLString = (() => {
const parser = new DOMParser();
return str => parser.parseFromString(str, "text/html");
})();
const getSearchStringForDoc = doc => {
return [
doc.title,
doc.body.innerText
].map(str => str.trim())
.join(" ");
};
const stringMatchesQuery = (str, query) => {
    str = str.toLowerCase();
    query = query.toLowerCase();
    return query
        .split(/\W+/)
        .filter(Boolean) // drop empty pieces; str.includes("") is always true
        .some(q => str.includes(q));
};
const htmlStringMatchesQuery = (str, query) => {
const htmlDoc = parseHTMLString(str);
const htmlSearchString = getSearchStringForDoc(htmlDoc);
return stringMatchesQuery(htmlSearchString, query);
};
console.log("Match 'viewport':", htmlStringMatchesQuery(htmlString, "viewport"));
console.log("Match 'Interesting':", htmlStringMatchesQuery(htmlString, "Interesting"));
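To tie this back to the five files in the question, here is a rough sketch of running such a matcher over several already-loaded pages. findMatchingFiles, plainTextMatcher and the sample page contents are all made up for illustration; in a browser you would pass htmlStringMatchesQuery as the matcher instead of the stand-in.

```javascript
// Hypothetical helper: `pages` maps a filename to its already-loaded HTML
// string; `matches` is a (htmlString, query) => boolean function such as
// htmlStringMatchesQuery. Returns the filenames whose content matches.
const findMatchingFiles = (pages, query, matches) =>
    Object.keys(pages).filter(filename => matches(pages[filename], query));

// Stand-in matcher so the sketch runs without a browser DOM. It searches the
// raw markup, so it is NOT as smart as the DOMParser-based version.
const plainTextMatcher = (html, query) =>
    html.toLowerCase().includes(query.toLowerCase());

// Made-up sample data standing in for the loaded files.
const pages = {
    "file1.html": "<p>Cats and dogs</p>",
    "file2.html": "<p>Only birds here</p>"
};

// Logs an array containing only "file1.html".
console.log(findMatchingFiles(pages, "dogs", plainTextMatcher));
```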
Upvotes: 5