Reputation: 811
Problem:
I want to test a set of values is equal to another set but not necessary their order of position will be same. For example: 'a,b,c,d' must be equal to 'b,a,c,d'
What I have tried:
I have tried IN Clause and I have checked with FIND_IN_SET.
SELECT 'a,b,c,d' IN 'b,c,a,d';
Both of them can not do this work.
Will be thankful if anyone can help.
Thanks Sandeep
Upvotes: 1
Views: 1165
Reputation: 10346
This demonstrates the use the splitting of values to multiple rows, mentioned by GolezTrol in combination with FIND_IN_SET
, modified to function to be used in forms like:
SELECT are_sets_equal(col_with_set, 'a,b,d,c') FROM example;
or
SELECT * FROM example
WHERE are_sets_equal(col_with_set, 'a,b,d,c')
The idea is this:
The limit for both sets is 1000 values, but could be expanded easily:
DELIMITER //
CREATE FUNCTION are_sets_equal(set_a VARCHAR(2000), set_b VARCHAR(2000)) RETURNS BOOLEAN
BEGIN
DECLARE is_equal BOOLEAN;
DECLARE count_a INT;
DECLARE count_b INT;
-- calculate the count of elements in both sets
SET count_a = 1 + LENGTH(set_a) - LENGTH(REPLACE(set_a, ',', ''));
SET count_b = 1 + LENGTH(set_b) - LENGTH(REPLACE(set_b, ',', ''));
SELECT
-- if all elements of the first set are contained in the second
-- set and both sets have the same number of elements then both
-- sets are considered equal
COUNT(t.value) = count_a AND count_a = count_b INTO is_equal
FROM (
SELECT
SUBSTRING_INDEX(SUBSTRING_INDEX(e.col, ',', n.n), ',', -1) value
FROM ( SELECT set_a AS col ) e
CROSS JOIN(
-- build for up to 1000 separated values
SELECT
a.N + b.N * 10 + c.N * 100 + 1 AS n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
ORDER BY n
) n
WHERE n.n <= count_a
) t
WHERE FIND_IN_SET(t.value, set_b);
return is_equal;
END //
DELIMITER ;
Building a numbers table
SELECT
a.N + b.N * 10 + c.N * 100 + 1 AS n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
ORDER BY n
builds a number table with the values from 1 to 1000 on the fly. How to expand this to a greater range should be obvious.
Note Such a numbers table could be contained in your database, so there would be no need to create one on the fly.
Split a set to a table
With the help of this number table we can split the value list to a table, using nested SUBSTRING_INDEX
calls to cut just one value after the other from the list as mentioned in SQL split values to multiple rows:
SELECT
SUBSTRING_INDEX(SUBSTRING_INDEX(t.col, ',', n.n), ',', -1) value
FROM (SELECT @set_a as col ) t CROSS JOIN (
-- build for up to 100 separated values
SELECT
a.N + b.N * 10 + c.N * 100 + 1 AS n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
ORDER BY n
) n
WHERE
n <= 1 + LENGTH(@set_a) - LENGTH(REPLACE(@set_a, ',', ''))
Count the elements of the sets
We get the count of elements in the list by the expression in the WHERE
clause: we have one more values than occurences of the separator.
Then we restrict the result by searching those values in the second set with FIND_IN_SET
.
As a last step we check count of values in the result against the count of values in both sets and return this value.
Demo
Experiment with this demo.
Upvotes: 3
Reputation: 64496
You can use a UDF(user defined function) to compare your sets,As from comments no duplicate values will be in the sets,I have little customized a UDF function provided by @Simon at mso.net.I have calculated a count of values in a second comma separated list and in the end compared with the matched result of find_in_set
stored in numReturn
variable,If both equal the return 1 else return 0 for non matched.Please not this will not work for repeated/duplicate values in a set
DELIMITER $$
DROP FUNCTION IF EXISTS `countMatchingElements`$$
CREATE DEFINER = `root` @`localhost` FUNCTION `countMatchingElements` (
inFirstList VARCHAR (1000),
inSecondList VARCHAR (1000)
) RETURNS TINYINT (3) UNSIGNED NO SQL DETERMINISTIC SQL SECURITY INVOKER
BEGIN
DECLARE numReturn TINYINT UNSIGNED DEFAULT 0 ;
DECLARE idsInFirstList TINYINT UNSIGNED ;
DECLARE currentListItem VARCHAR (255) DEFAULT '' ;
DECLARE currentID TINYINT UNSIGNED ;
DECLARE total_values_in_second INT DEFAULT 0 ;
SET total_values_in_second = ROUND(
(
LENGTH(inSecondList) - LENGTH(REPLACE (inSecondList, ',', ''))
) / LENGTH(',')
) + 1 ;
SET idsInFirstList = (CHAR_LENGTH(inFirstList) + 1) - CHAR_LENGTH(REPLACE(inFirstList, ',', '')) ;
SET currentID = 1 ;
-- Loop over inFirstList, and for each element that is in inSecondList increment numReturn
firstListLoop :
REPEAT
SET currentListItem = SUBSTRING_INDEX(
SUBSTRING_INDEX(inFirstList, ',', currentID),
',',
- 1
) ;
IF FIND_IN_SET(currentListItem, inSecondList)
THEN SET numReturn = numReturn + 1 ;
END IF ;
SET currentID = currentID + 1 ;
UNTIL currentID > idsInFirstList
END REPEAT firstListLoop ;
IF total_values_in_second = numReturn
THEN RETURN 1 ;
ELSE RETURN 0 ;
END IF ;
END $$
DELIMITER ;
Upvotes: 0
Reputation: 116180
FIND_IN_SET
should do the trick, but the first value is an individual value and doesn't work right if it contains a comma. You would have to look for each individual value:
SELECT
FIND_IN_SET('a', 'b,c,a,d') AND
FIND_IN_SET('b', 'b,c,a,d') AND
FIND_IN_SET('c', 'b,c,a,d') AND
FIND_IN_SET('d', 'b,c,a,d')
If you don't have these separate values available, maybe you can split the input value into multiple values. The answers to the question 'Split values to multiple rows' might give you some inspiration.
The better solution would be not to store comma separated values at all. It's considered bad practice.
Upvotes: 3