Evgeniy
Evgeniy

Reputation: 2605

Find matchesm between parts of strings in Excel

I have a table with urls, like

    A   B
1   example.com/id/1/baz/   site.com/id/2/ban/
2   example.com/id/3/boom/  site.com/id/3/baff/
3   example.com/id/123/foo/ site.com/id/321/bar/

What i need is to compare IDs in corresponding cells of the same row (A1 vs. B1, A2 vs. B2) and write into C false if IDs are different and true, if they are same.

The formula i currently use is

=LEFT(SUBSTITUTE(MID(A2,FIND("/id/",A1)+4,99),"/",REPT(" ",99)),99)=LEFT(SUBSTITUTE(MID(B1,FIND("/id/",B1)+4,99),"/",REPT(" ",99)),99)

But the formula works if urls are EXACTLY so, as in the cited table. If any url has nothing after ID, like example.com/id/1 the formula produces false negatives.

How could the formula be adjusted to proceed urls without an after-id-slug?

Upvotes: 0

Views: 105

Answers (5)

Terry W
Terry W

Reputation: 3277

Please see below approach using Power Query which is available in Excel 2010 and later versions. Mine is using Excel 2016.

PQ

Here are the codes behind the scene for reference only. All steps are using built-in functions without advanced coding.

let
    Source = Excel.CurrentWorkbook(){[Name="Table4"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "Column1", "Column1 - Copy"),
    #"Duplicated Column1" = Table.DuplicateColumn(#"Duplicated Column", "Column2", "Column2 - Copy"),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Duplicated Column1", "Column1 - Copy", Splitter.SplitTextByEachDelimiter({"/id/"}, QuoteStyle.Csv, false), {"Column1 - Copy.1", "Column1 - Copy.2"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1 - Copy.1", type text}, {"Column1 - Copy.2", type text}}),
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Changed Type1", "Column1 - Copy.2", Splitter.SplitTextByEachDelimiter({"/"}, QuoteStyle.Csv, false), {"Column1 - Copy.2.1", "Column1 - Copy.2.2"}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Column1 - Copy.2.1", Int64.Type}, {"Column1 - Copy.2.2", type text}}),
    #"Split Column by Delimiter2" = Table.SplitColumn(#"Changed Type2", "Column2 - Copy", Splitter.SplitTextByEachDelimiter({"/id/"}, QuoteStyle.Csv, false), {"Column2 - Copy.1", "Column2 - Copy.2"}),
    #"Changed Type3" = Table.TransformColumnTypes(#"Split Column by Delimiter2",{{"Column2 - Copy.1", type text}, {"Column2 - Copy.2", type text}}),
    #"Split Column by Delimiter3" = Table.SplitColumn(#"Changed Type3", "Column2 - Copy.2", Splitter.SplitTextByEachDelimiter({"/"}, QuoteStyle.Csv, false), {"Column2 - Copy.2.1", "Column2 - Copy.2.2"}),
    #"Changed Type4" = Table.TransformColumnTypes(#"Split Column by Delimiter3",{{"Column2 - Copy.2.1", Int64.Type}, {"Column2 - Copy.2.2", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type4", "Custom", each [#"Column1 - Copy.2.1"]=[#"Column2 - Copy.2.1"]),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Column1", "Column2", "Custom"})
in
    #"Removed Other Columns"

Upvotes: 0

Ron Rosenfeld
Ron Rosenfeld

Reputation: 60414

If you have Excel 2013+, with the FILTERXML function, you can create an XML out of your string using the / as the node markers, and then examine the node that follow id.

Won't matter if there is nothing following:

=FILTERXML("<t><s>" &SUBSTITUTE(A1,"/","</s><s>") & "</s></t>","//t[contains(.,'id')]/s/following-sibling::s[2]")=
FILTERXML("<t><s>" &SUBSTITUTE(B1,"/","</s><s>") & "</s></t>","//t[contains(.,'id')]/s/following-sibling::s[2]")

enter image description here

Upvotes: 1

Scott Craner
Scott Craner

Reputation: 152660

You can search the substring in the other:

=ISNUMBER(SEARCH(LEFT(MID(A1,FIND("/id/",A1),LEN(A1))&"/",FIND("}}}",SUBSTITUTE(MID(A1,FIND("/id/",A1),LEN(A1))&"/","/","}}}",3))),B1&"/"))

![enter image description here

Upvotes: 1

EEM
EEM

Reputation: 6659

Try this formula, similar to the one posted by @JvdV

=TRIM(LEFT(SUBSTITUTE(MID(A1,4+FIND("/id/",A1),LEN(A1)),"/",REPT(" ",LEN(A1))),LEN(A1)))
=TRIM(LEFT(SUBSTITUTE(MID(B1,4+FIND("/id/",B1),LEN(B1)),"/",REPT(" ",LEN(B1))),LEN(B1)))

Upvotes: 0

JvdV
JvdV

Reputation: 75960

You could try:

=TRIM(MID(SUBSTITUTE(A1,"/",REPT(" ",LEN(A1))),2*LEN(A1)+1,LEN(A1)))=TRIM(MID(SUBSTITUTE(B1,"/",REPT(" ",LEN(B1))),2*LEN(B1)+1,LEN(B1)))

This would simply break up the string into seperate words with the "/" as a delimiter. That way you are comparing the nth word of the first string with the nth worth of another. In this case the 3rd "word", being the id, of A1 and B1.

Change ..2*.. into ..3*.. to compare the 4th word etc.

Upvotes: 2

Related Questions