Ontokrat

Reputation: 189

Search for multiple elements in an array

I want to retrieve from an array all the elements that match several strings at once (all of them, and not necessarily whole words), like a search engine returning all results matching term_searched#1 && term_searched#2.

This is not a question about duplicates in the array (there are none), but about searching for a conjunction of elements. Traditionally, a search looks for one element, either by itself or in disjunction with others (a|b|c). I just want to search for (a && b && c).

I tried:

The first regex works, but not every time: it seems very fragile...

So I don't know whether I'm heading in the right direction (match) or whether I just can't work out the right regex... I need your advice.

// filter grid by searching on 'input' event
'input #search': (e)=> {
    var keypressed = e.currentTarget.value;

    // create array on 'space' input
    var keyarr = keypressed.toLowerCase().split(" ");

    // format each array's element into regex expression
    var keyarrReg = [];
    for (var i = 0; i < keyarr.length; i++) {
        var reg = '(?=' + keyarr[i] + ')';
        keyarrReg.push(reg);
    }

    // array to regex string into '/(?=element1).*(?=element2)/gim' format
    var searching = new RegExp(keyarrReg.join(".*"), 'mgi');

    // set grid
    var grid = new Muuri('#gridre', {
        layout: {
            fillGaps: true,
        }
    });

    if (keypressed) {
        // filter all grid's items (grid of items is an array)
        grid.filter(function (item) {
            var searchoperator = item.getElement().textContent.toLowerCase().match(searching);
            // get items + only their text + lower case their text + return true (not false) if the value ('keypressed') is found in them
            //var searchoperator = item.getElement().textContent.toLowerCase().indexOf(keypressed.toLowerCase()) != -1;
            return searchoperator;
        }
        [....]

    }
}

Edit with Gawil's answer adapted to my initial code (to help if needed)

// filter grid by searching on 'input' event
'input #search': (e)=> {
    var keypressed = e.currentTarget.value;

    // create array on 'space' input
    var keyarr = keypressed.toLowerCase().split(" ");

    // convert the array to a regex string, in a '^(?=.*word1)(?=.*word2).*$' format
    // here is Gawil's answer, formatted by Teemu 
    var searching = new RegExp('^(?=.*' + keyarr.join(')(?=.*') + ').*$', 'm');

    // set grid
    var grid = new Muuri('#gridre', {
        layout: {
            fillGaps: true,
        }
    });

    if (keypressed) {
        // filter all grid's items (grid of items is an array)
        grid.filter(function (item) {
            // get items + only their text + lower case their text + replace line breaks with spaces
            var searchraw = item.getElement().textContent.toLowerCase().replace(/\r\n|\n|\r/gm,' ');
            var searchoperator = searchraw.match(searching);
            return searchoperator;
        }
        [....]

    }
}

Upvotes: 0

Views: 1833

Answers (2)

Gawil

Reputation: 1211

The code below logs each element of the array that contains every word in the words list (here, cats and dog).
It is based on the regex ^(?=.*word1)(?=.*word2).*$
To handle new lines, use this one instead:
^(?=(?:.|\n)*word1)(?=(?:.|\n)*word2).*$

You can add as many words as you want following the same logic, and the order of the words does not matter.

It is very similar to what you tried, except that all the (?=) checks have to be done before matching the string. Indeed, your first regex works only when the words are in the right order (element1 and then element2). Your second regex almost works, but you wrote only lookaheads, so it checks the presence of each word but won't match anything.
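To make the order dependence concrete, here is a small sketch comparing the two pattern shapes on the same strings. The sample words and strings are made up for illustration, and the flags are simplified to `i` (the question used `mgi`):

```javascript
// Pattern shape from the question: lookaheads joined by '.*'
var ordered = /(?=cats).*(?=dogs)/i;
// Pattern shape from this answer: every lookahead anchored at the start
var anyOrder = /^(?=.*cats)(?=.*dogs).*$/i;

console.log(ordered.test('cats and dogs'));  // true
console.log(ordered.test('dogs and cats'));  // false: 'dogs' never appears after 'cats'
console.log(anyOrder.test('cats and dogs')); // true
console.log(anyOrder.test('dogs and cats')); // true
```

Because each lookahead in the second pattern starts scanning from the beginning of the string, the position of one word no longer constrains where the other may be found.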

var words = ["cats", "dog"]
var array = [
  "this is a string",
  "a string with the word cats",
  "a string with the word dogs",
  "a string with both words cats and dogs",
  "cats rule everything",
  "dogs rule cats",
  "this line is for dog\nbut cats prefer this one"
]

var regexString = "^";
words.forEach(function(word) { regexString += ("(?=(?:.|\n)*"+word+")"); });

var regex = new RegExp(regexString);

array.forEach(function(str) { // Loop through the array
  if(str.match(regex)) {
    console.log(str); // Display if words have been found
  }
});
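One caveat when the words come from user input: characters such as `.`, `+` or `(` are regex metacharacters, so a search term like `c++` would break or change the pattern. Below is a minimal sketch of the same lookahead approach with escaping added. `escapeRegExp` and `buildAllWordsRegex` are hypothetical helper names, not part of the code above, and `[\s\S]` is swapped in for `(?:.|\n)` as another way to match across line breaks:

```javascript
// Hypothetical helper: escape regex metacharacters in a search term.
function escapeRegExp(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build '^(?=[\s\S]*word1)(?=[\s\S]*word2)' from a list of words.
function buildAllWordsRegex(words) {
  var lookaheads = words
    .filter(function (word) { return word.length > 0; }) // skip empty terms (e.g. from double spaces)
    .map(function (word) { return '(?=[\\s\\S]*' + escapeRegExp(word) + ')'; });
  return new RegExp('^' + lookaheads.join(''), 'i'); // 'i' makes the search case-insensitive
}

var allWords = buildAllWordsRegex(['cats', 'dog']);
console.log(allWords.test('dogs rule cats'));              // true
console.log(allWords.test('a string with the word cats')); // false
console.log(buildAllWordsRegex(['c++']).test('I like C++')); // true
```

Filtering out empty terms matters for the search box in the question, since splitting the input on a space yields empty strings whenever the user types two spaces in a row.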
  

Upvotes: 3

Teemu

Reputation: 23396

If I've understood your question correctly, you have an array of strings and some keywords, and a member of the array is accepted in the search results only if it contains all of the keywords.

You can use a "whitelist", i.e. a RegExp where the keywords are separated with |. Then iterate through the array, and for every member create an array of matches against the whitelist. Remove the duplicates from the matches array, and check that all the keywords are in the list simply by comparing the length of the matches array to the count of the keywords. Like so:

function searchAll (arr, keywords) {
    var txt = keywords.split(' '),
    len = txt.length,
    regex = new RegExp(txt.join('|'), 'gi'), // A pipe separated whitelist
    hits; // The final results to return, an array containing the contents of the matched members
    // Create an array of the rows matching all the keywords
    hits = arr.filter(function (row) {
        var res = row.match(regex), // An array of matched keywords
           final, temp;
        if (!res) {return false;}
        // Remove the dups from the matches array
        temp = {}; // Temporary store for the found keywords
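        final = res.filter(function (match) {
            // Normalize the case, because the whitelist regex is case-insensitive
            var key = match.toLowerCase();
            if (!temp[key]) {
                // Add the found keyword to the store, and accept the keyword to the final array
                temp[key] = true;
                return true;
            }
            return false;
        });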
        // Return matches count compared to keywords count to make sure all the keywords were found
        return final.length === len;
    });
    return hits;
}

var txt = "Some text including a couple of numbers like 8 and 9. More text to retrieve, also containing some numbers 7, 8, 8, 8 and 9",
  arr = txt.split('.'),
  searchBut = document.getElementById('search');
  
searchBut.addEventListener('change', function (e) {
  var hits = searchAll(arr, e.target.value);
  console.log(hits);
});
<input id="search">

The advantage of the whitelist is that you don't have to know the exact order of the keywords in the text, and the text can contain any characters.
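For comparison, when the keywords are plain substrings rather than patterns, the same "all keywords must be present" test can be sketched without any regex, using `Array.prototype.every` and `String.prototype.indexOf`. This is a different technique from the whitelist above, and `containsAll` is a hypothetical name chosen for the example:

```javascript
// Hypothetical helper: true when `text` contains every space-separated keyword.
function containsAll(text, keywords) {
  var haystack = text.toLowerCase();
  return keywords
    .toLowerCase()
    .split(' ')
    .filter(function (word) { return word.length > 0; }) // ignore empty terms
    .every(function (word) { return haystack.indexOf(word) !== -1; });
}

var rows = ['Some text with numbers like 8 and 9', 'More text with 7 only'];
var found = rows.filter(function (row) { return containsAll(row, '8 9'); });
console.log(found); // ['Some text with numbers like 8 and 9']
```

Note that with an empty search string `every` runs over an empty list and returns true, so every row is kept, which is usually the desired behaviour for an empty search box.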

Upvotes: 1
