Reputation: 1
I have fields within a table with values originally filled in by hand. Even if the same values are entered/meant, there can be "slight" deviations. Now I want to compare whether values in 2 columns within a row are quite similar.
If there is some similarity, I would like to have a True in a new column, otherwise a False. The use case is similar to the fuzzy join when two tables are merged, but the fields are within a table and do not work as a primary key. I have created a table below of what this should look like:
No | A header | Another header | Calculated Column |
---|---|---|---|
1 | Zürich, 1.OG Telefonzentrale | Telefonzentrale | True |
2 | Mittelterrasse 1.OG Raum T190 | Mittelterrasse T1 | True |
1 | TM-Raum 225 | Bern, Bollwerk 10 / 2.OG | False |
2 | G7803 | 91G7803 | True |
It would be great if someone could help me in this topic.
Upvotes: 0
Views: 964
Reputation: 3741
I dont know if is there a way to do this, but we can try to verify how many word from column1 appear in column2:
CheckIfTrue__ =
VAR SplitByCharacter = " "
VAR Org = SELECTEDVALUE(Sheet3[A header])
VAR CurrentF = SELECTEDVALUE(Sheet3[Another header] )
VAR Table0 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", Org),
VAR TokenCount =
PATHLENGTH ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ) )
RETURN
GENERATESERIES ( 1, MAX(TokenCount,1) )
),
"Word", PATHITEM ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ), [Value] )
),
"Word",[Word])
VAR Table1 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", CurrentF),
VAR TokenCount =
PATHLENGTH ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ) )
RETURN
GENERATESERIES ( 1, MAX(TokenCount,1) )
),
"Word", PATHITEM ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ), [Value] )
),
"Word",[Word])
RETURN
COUNTROWS(INTERSECT(Table0, Table1))+0
Upvotes: 0