Reputation: 56
I need to create a new DAX column that will search a string from another column in the same table. It will search for any of the values in a 2nd table, and return True if any of those values are found. Simplified example:
Let's say I have a table named Sentences with 1 column:
Sentences
Col1
----------------
"The aardvark admitted it was wrong"
"The attractive peanut farmer graded the term paper"
"The awning was too tall to touch"
And another table named FindTheseWords with a list of values
FindTheseWords
Col1
----------------
peanut
aardvark
I'll be creating Col2 in the Sentences table, which should return
Sentences
Col1 Col2
---------------------------------------------------- ------------------------
"The aardvark admitted it was wrong" TRUE
"The attractive peanut farmer graded the term paper" TRUE
"The awning was too tall to touch" FALSE
The list of FindTheseWords is actually pretty long, so I can't just hardcode them and use an OR. I need to reference the table. I don't care about spaces, so a sentence with "peanuts" would also return true.
I've seen a good implementation of this in M, but the performance of my load took a pretty good hit, so I'm hoping to find a DAX option for a new column. The M Solution, for reference: How to search multiple strings in a string?
Upvotes: 0
Views: 2292
Reputation: 4346
fact table
| Column1 |
|------------------------------------------------------|
| The aardvark admitted it was wrong |
| The attractive peanut farmer graded the term paper |
| The awning was too tall to touch |
| This is text string |
| Tester is needed |
sentence table
| Column1 |
|------------|
| attractive |
| peanut |
| aardvark |
| Tester |
Calculated column
Column =
VAR _1 =
ADDCOLUMNS ( 'fact', "newColumn", SUBSTITUTE ( 'fact'[Column1], " ", "|" ) )
VAR _2 =
GENERATE (
_1,
ADDCOLUMNS (
GENERATESERIES ( 1, PATHLENGTH ( [newColumn] ) ),
"Words", PATHITEM ( [newColumn], [Value], TEXT )
)
)
VAR _3 =
ADDCOLUMNS (
_2,
"test", CONTAINS ( VALUES ( sentence[Column1] ), sentence[Column1], [Words] )
)
VAR _4 =
DISTINCT (
SELECTCOLUMNS (
FILTER ( _3, [test] = TRUE ),
"Column1", [Column1] & "",
"test", [test] & ""
)
)
VAR _5 =
DISTINCT (
SELECTCOLUMNS (
FILTER ( _3, [test] = FALSE ),
"Column1", [Column1] & "",
"test", [test] & ""
)
)
VAR _7 =
FILTER ( _5, [Column1] = MAXX ( _4, [Column1] ) )
VAR _8 =
UNION ( _4, _7 )
RETURN
MAXX (
FILTER ( _8, [Column1] = CALCULATE ( MAX ( 'fact'[Column1] ) ) ),
[test]
)
Upvotes: 1