sandeepKumar
sandeepKumar

Reputation: 811

How can match a string of comma separated , irrespective of their position order in Mysql

Problem:

I want to test a set of values is equal to another set but not necessary their order of position will be same. For example: 'a,b,c,d' must be equal to 'b,a,c,d'

What I have tried:

I have tried IN Clause and I have checked with FIND_IN_SET.

SELECT 'a,b,c,d' IN 'b,c,a,d';

Both of them can not do this work.

Will be thankful if anyone can help.

Thanks Sandeep

Upvotes: 1

Views: 1165

Answers (3)

VMai
VMai

Reputation: 10346

This demonstrates the use the splitting of values to multiple rows, mentioned by GolezTrol in combination with FIND_IN_SET, modified to function to be used in forms like:

SELECT are_sets_equal(col_with_set, 'a,b,d,c') FROM example;

or

SELECT * FROM example
WHERE are_sets_equal(col_with_set, 'a,b,d,c')

The idea is this:

  • Split the the first set to a temporary table
  • Check how many of those values are found in the second set.
  • If this count is equal to the count of elements in both sets, then the sets are equal
  • The function will return 1, if both sets are equal and 0, if the sets differ as by requirement.

The limit for both sets is 1000 values, but could be expanded easily:

DELIMITER //
CREATE FUNCTION are_sets_equal(set_a VARCHAR(2000), set_b VARCHAR(2000)) RETURNS BOOLEAN
BEGIN
  DECLARE is_equal BOOLEAN;
  DECLARE count_a INT;
  DECLARE count_b INT;

  -- calculate the count of elements in both sets 
  SET count_a = 1 + LENGTH(set_a) - LENGTH(REPLACE(set_a, ',', ''));
  SET count_b = 1 + LENGTH(set_b) - LENGTH(REPLACE(set_b, ',', ''));

  SELECT
    -- if all elements of the first set are contained in the second
    -- set and both sets have the same number of elements then both
    -- sets are considered equal
    COUNT(t.value) = count_a AND count_a = count_b INTO is_equal
    FROM (
      SELECT
        SUBSTRING_INDEX(SUBSTRING_INDEX(e.col, ',', n.n), ',', -1) value
      FROM ( SELECT set_a AS col ) e
      CROSS JOIN(
        -- build for up to 1000 separated values
        SELECT 
            a.N + b.N * 10 + c.N * 100 + 1 AS n
        FROM
            (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
           ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
           ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
        ORDER BY n
    ) n
      WHERE n.n <= count_a
    ) t
    WHERE FIND_IN_SET(t.value, set_b);

    return is_equal;
END //
DELIMITER ;

Explanation

Building a numbers table

SELECT 
    a.N + b.N * 10 + c.N * 100 + 1 AS n
FROM
    (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
   ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
   ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
ORDER BY n

builds a number table with the values from 1 to 1000 on the fly. How to expand this to a greater range should be obvious.

Note Such a numbers table could be contained in your database, so there would be no need to create one on the fly.

Split a set to a table

With the help of this number table we can split the value list to a table, using nested SUBSTRING_INDEX calls to cut just one value after the other from the list as mentioned in SQL split values to multiple rows:

SELECT
    SUBSTRING_INDEX(SUBSTRING_INDEX(t.col, ',', n.n), ',', -1) value
FROM (SELECT @set_a as col ) t CROSS JOIN (
    -- build for up to 100 separated values
    SELECT 
        a.N + b.N * 10 + c.N * 100 + 1 AS n
    FROM
        (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
       ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
       ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
    ORDER BY n
) n
WHERE 
    n <= 1 + LENGTH(@set_a) - LENGTH(REPLACE(@set_a, ',', ''))

Count the elements of the sets

We get the count of elements in the list by the expression in the WHERE clause: we have one more values than occurences of the separator.

Then we restrict the result by searching those values in the second set with FIND_IN_SET.

As a last step we check count of values in the result against the count of values in both sets and return this value.

Demo

Experiment with this demo.

Upvotes: 3

M Khalid Junaid
M Khalid Junaid

Reputation: 64496

You can use a UDF(user defined function) to compare your sets,As from comments no duplicate values will be in the sets,I have little customized a UDF function provided by @Simon at mso.net.I have calculated a count of values in a second comma separated list and in the end compared with the matched result of find_in_set stored in numReturn variable,If both equal the return 1 else return 0 for non matched.Please not this will not work for repeated/duplicate values in a set

DELIMITER $$
DROP FUNCTION IF EXISTS `countMatchingElements`$$
CREATE DEFINER = `root` @`localhost` FUNCTION `countMatchingElements` (
  inFirstList VARCHAR (1000),
  inSecondList VARCHAR (1000)
) RETURNS TINYINT (3) UNSIGNED NO SQL DETERMINISTIC SQL SECURITY INVOKER 
BEGIN
  DECLARE numReturn TINYINT UNSIGNED DEFAULT 0 ;
  DECLARE idsInFirstList TINYINT UNSIGNED ;
  DECLARE currentListItem VARCHAR (255) DEFAULT '' ;
  DECLARE currentID TINYINT UNSIGNED ;
  DECLARE total_values_in_second INT DEFAULT 0 ;
  SET total_values_in_second = ROUND(
    (
      LENGTH(inSecondList) - LENGTH(REPLACE (inSecondList, ',', ''))
    ) / LENGTH(',')
  ) + 1 ;
  SET idsInFirstList = (CHAR_LENGTH(inFirstList) + 1) - CHAR_LENGTH(REPLACE(inFirstList, ',', '')) ;
  SET currentID = 1 ;
  -- Loop over inFirstList, and for each element that is in inSecondList increment numReturn
  firstListLoop :
  REPEAT
    SET currentListItem = SUBSTRING_INDEX(
      SUBSTRING_INDEX(inFirstList, ',', currentID),
      ',',
      - 1
    ) ;
    IF FIND_IN_SET(currentListItem, inSecondList) 
    THEN SET numReturn = numReturn + 1 ;
    END IF ;
    SET currentID = currentID + 1 ;
    UNTIL currentID > idsInFirstList 
  END REPEAT firstListLoop ;
  IF total_values_in_second = numReturn 
  THEN RETURN 1 ;
  ELSE RETURN 0 ;
  END IF ;
END $$

DELIMITER ;

Fiddle Demo

Upvotes: 0

GolezTrol
GolezTrol

Reputation: 116180

FIND_IN_SET should do the trick, but the first value is an individual value and doesn't work right if it contains a comma. You would have to look for each individual value:

SELECT 
  FIND_IN_SET('a', 'b,c,a,d') AND
  FIND_IN_SET('b', 'b,c,a,d') AND
  FIND_IN_SET('c', 'b,c,a,d') AND
  FIND_IN_SET('d', 'b,c,a,d')

If you don't have these separate values available, maybe you can split the input value into multiple values. The answers to the question 'Split values to multiple rows' might give you some inspiration.

The better solution would be not to store comma separated values at all. It's considered bad practice.

Upvotes: 3

Related Questions