Reputation: 473
I'm have two tables: A and B, each of them contains strings. I need to find all the matches between table A and table B when the string of table A contains the string of table B (B.stringColumn is substring of A.stringColumn).
I did it by the following code using CHARINDEX()
function:
SELECT *
FROM A, B
WHERE CHARINDEX(B.stringColumn, A.stringColumn) > 0
Is there more efficient way to do so?
This tables may contain massive amount of data, that why i'm asking this question.
Thanks in advance, Nuriel
Upvotes: 0
Views: 1158
Reputation: 356
Hm, you can use LIKE operator. This will not maybe give you some significant performance upgrade, but it has a chance of using index (if there is an index on that column), while with functions in where clause, indexes will never be used.
Upvotes: 0
Reputation: 1270683
I would be more inclined to phrase this as:
SELECT *
FROM A JOIN
B
ON A.stringColumn LIKE '%' + B.stringColumn + '%';
Alas, that does not help performance at all. The problem is that looking for patterns in the middle of a string cannot make use of optimizations or any other techniques. You are stuck with nested loop join algorithms.
In some cases, you might be able to use a full text index. This can be a bit challenging when joining between two tables.
There is a technical solution to this problem. It involves indexes built on something called n-grams (say, 3-character combinations). However, SQL Server does not support this indexing type.
Upvotes: 2