Umar.H
Umar.H

Reputation: 23099

RegexExtract multiple values from a single row from an array

I have a table of products

Product
Vegetable
Soup
Meat
Fish

I have a list of items:

Item
Fish and Vegetable Soup
meat soup
Fish and Vegetable

now i want to extract each match from above via a regex function

I thought it would be as as simple as

{=REGEXEXTRACT(A2,TEXTJOIN('|', TRUE,PRODUCTS!A2:A))} 

however this returns none or just the first match in my list.

target output:

Item                    |Expected Output
Fish and Vegetable Soup |Fish, Vegetable, Soup
meat soup               |Meat, Soup
Fish and Vegetable      |Fish, Vegetable

demo sheet here

given that D2 is the text I want to use the expression on, and A2:A is my list of products.

Thanks!

Upvotes: 2

Views: 1032

Answers (1)

Umar.H
Umar.H

Reputation: 23099

Figured it out:

Short Version:

=TEXTJOIN(", ",TRUE,ARRAYFORMULA(IFERROR(REGEXEXTRACT(SPLIT(D4," ",True),TEXTJOIN("|",TRUE,A2:A)),"")))

as the regexextract function ends after the first match we need to do the following to make it iterate over my row.

first we use split and split each item in each cell by whitespace ' '

=split(A2, " ")

which gives us

Split Array

We wrap this in an ArrayFormula which according to google's helper function :

Enables the display of values returned from an array formula into multiple rows and/or columns and the use of non-array functions with arrays

next step is to use the regextract function, with the help of the ArrayFormula lets us match each element returned from the split

=ARRAYFORMULA(REGEXEXTRACT(SPLIT(B2, " "),TEXTJOIN("|", TRUE, A2:A)

textjoin returns a single cell with values seperated by our chosen delimiter | in some programming languages | is an OR statement which gives us.

enter image description here

to top it off, we use an IFERROR statement to ignore the #NAs we don't care about those. and a further TEXTJOIN to bring it all in one cell sepeareted by comma.

=TEXTJOIN(", ",TRUE,ARRAYFORMULA(IFERROR(REGEXEXTRACT(SPLIT(D2," ",True),TEXTJOIN("|",TRUE,A2:A)),"")))

enter image description here

Upvotes: 4

Related Questions