yhd.leung
yhd.leung

Reputation: 1562

Google Sheet - Find partial match between ranges / columns

On Google Sheet, given two ranges / columns of data, lookup column A based on its partial match on column B, show empty if no matches.
Particularly in my case, given a list of blocks and a list of rooms with block prefix, identify the block for each room.

+--------+---------+------------------------+
| Block  |  Room   | Block of Room (Output) |
+--------+---------+------------------------+
| AD     | AD201   | AD                     |
| AE     | AD208/3 | AD                     |
| HG     | ADG07   | AD                     |
| HH     | ADUG8   | AD                     |
| HSH    | BY03    |                        |
|        | HG03    | HG                     |
|        | HGG01   | HG                     |
|        | HSH01   | HSH                    |
|        | HSHG5   | HSH                    |
+--------+---------+------------------------+

I tinkered with search(), index() and match() like this: =IFERROR(if(search(index($A$2:$A,MATCH(B2,$A$2:$A,1),1), B2), index($A$2:$A,MATCH(B2,$A$2:$A,1),1), ""))
but it didn't work well with arrayformula().

Finally I come up with a solution like this
=ARRAYFORMULA( iferror(IF(SEARCH( VLOOKUP(B2:B,A2:A,1), B2:B), VLOOKUP(B2:B,A2:A,1), "")))

I wonder if there exists more elegant ways to do the matching.

Upvotes: 2

Views: 4475

Answers (2)

Andrew Apell
Andrew Apell

Reputation: 23

You could use Google Sheets add-ons like one I created called Flookup. It should do exactly what you want.

For this case, we are going to use a function called FLOOKUP. This is its syntax and what each parameter represents:

FLOOKUP (lookup_value, table_array, lookup_col, index_num, threshold, [rank])
  • lookup_value: the value you're looking up
  • table_array: the table you want to search
  • lookup_col: the column you want to search
  • index_num: the column you want data to be returned from
  • threshold: the percentage similarity below which data shouldn't be returned
  • rank: the nth best match (i.e. if the first one isn't to your liking)

I have decided to use a threshold value of 0.55 because, by scanning your dataset, I observed that the items in column A are about half the length of those in column B in general. This is a very non-scientific way of estimation, but it makes for a good starting point:

=FLOOKUP($A$1:$A$5, $B$1:$B$9, 1, 1, 0.55)

This formula picks every item in column A and searches for matches in column B. Matches are considered to be any text entries in column A that have a 0.55 level of similarity or higher to the items in column B.

Upvotes: 0

TheMaster
TheMaster

Reputation: 50799

=ARRAYFORMULA(IFERROR(REGEXEXTRACT(B2:B,"^("&TEXTJOIN("|",1,A2:A)&")")))
  • JOIN blocks by |(= or in regex)
  • Use that regex to extract block from room
  • "^" represents start of string

=ARRAYFORMULA(VLOOKUP(LEFT(B2:B10,2)&"*",A2:A10,1,0))

Uses two characters from LEFT of Col B to lookup in A and return result. This should be faster than regex, but maybe a little less accurate.

Upvotes: 3

Related Questions