Liu Kang
Liu Kang

Reputation: 1389

Retrieve data from a specific column based on multiple criteria (comma separated values) in Google Sheets

I'm trying to make a Google Sheet-formula that matches words in a cell based on a multiple criteria (comma separated) cell and returns a specific value.

Sheet1 contains phrases and a formula:

Phrases Formula
fruit bat {formula}
fruit bats {formula}
fruitbat {formula}
big fruit bat {formula}
fruit bat eating orange {formula}
orange fruit {formula}
apple is a good fruit {formula}
i like pie {formula}
pies for lunch {formula}
apple pie is better {formula}
juice from apples {formula}

Sheet2 search values (comma separated) and corresponding return values:

Comma-separated search values Return Value
fruit, bat Animal - Bats
fruit, orange, bat Animal - Orange bat
orange, fruit Food - Orange
apple, fruit Food - Apple
pie Food - Pie
apple, pie Food - Apple Pie
juice, apple Beverage - Apple

I want Sheet1 to look like this in the end:

Phrases Formula
fruit bat Animal - Bats
fruit bats Animal - Bats
fruitbat Animal - Bats
big fruit bat Animal - Bats
fruit bat eating orange Animal - Orange bat
orange fruit Food - Orange
apple is a good fruit Food - Apple
i like pie Food - Pie
pies for lunch Food - Pie
apple pie is better Food - Apple Pie
juice from apples Beverage - Apple

Right now my formula looks like this: =IFNA(VLOOKUP(IFNA(REGEXEXTRACT(LOWER(A2); LOWER(TEXTJOIN("|"; 1; SORT(Sheet2!A:A; 1; 0))))); Sheet2!A:B; 2; 0))

The formula correctly returns "Food - Pie" on the phrases "i like pie" and "pies for lunch". It incorrectly returns "Food - Pie" on the phrase "apple pie is better", and all other rows are empty.

Is it possible to modify the current formula so it's compatible with search for comma separated values?

Note that the matching should be wildcard, and not dependent on spaces.

Upvotes: 0

Views: 178

Answers (1)

NightEye
NightEye

Reputation: 11184

If and only if custom formula is an option due to unavailability of other answers, see the code below.

Code:

function getRegexMatch(string, exprs) {
  // sort descending based on search length to get max matches
  exprs = exprs.sort(function(a, b){
    return b[0].length - a[0].length;
  });
  // filter pattern where number of matches is equal to number of search values
  var output = exprs.filter(function([expr, value]){ 
    var patterns = expr.split(", ");
    var result = patterns.filter(function(pattern) {
      if(RegExp(pattern).exec(string))
        return true;
    });
    if (patterns.length == result.length)
      return true;
  });
  // return the value of max match pattern
  return output[0][1];
}

Output:

output

Note:

  • Search-return value pair were sorted based on the length of its search values (longer length comes first) before processing as we want to get the return value where all search values are matching the string. This is to prevent returning the value where not all search values match and other search values that have lower number of matches than the number of search values (e.g. fruit bat eating orange returning Animal - Orange bat instead of Animal - Bats which comes first if not sorted)
  • Given that we sorted the search-return value pairs based on its search value length (longer length comes first), it means that it will always encounter the best possible match first.
  • So when we see the first match, return it immediately.
  • The code above is called one by one per cell, but it can be modified to just call it once and return all column A's matches on column B.

Upvotes: 1

Related Questions