ChrisCode
ChrisCode

Reputation: 29

Mysql: Finding longest character sequence in a string

In mySQL, how can I find the length of the longest sequence of a given character? For example, in the following string

1325******2h3n***3k2n*

If I were looking for the * character, the result should be 6 because the chain of 6 * characters is the longest present in the string.

Upvotes: 1

Views: 223

Answers (3)

Matt Raines
Matt Raines

Reputation: 4218

Dylan Su's solution is clever and works well if you know the maximum number of characters is small or don't want the overhead of building a function.

On the other hand one of the following function definitions will work regardless of character length without having to add new UNION statements indefinitely.

This function loops over each of the characters in the string, and if they match the repeat character, increments a length counter. It then returns the max length.

DELIMITER //
CREATE FUNCTION LONGEST_CHARACTER_SEQUENCE(input VARCHAR(255), repeat_character CHAR(1))
RETURNS TINYINT UNSIGNED DETERMINISTIC NO SQL
BEGIN
    DECLARE max_length TINYINT UNSIGNED DEFAULT 0;
    DECLARE length TINYINT UNSIGNED DEFAULT 0;
    DECLARE in_sequence BOOLEAN DEFAULT 0;
    DECLARE position INT DEFAULT 1;
    WHILE position <= LENGTH(input) DO
        IF SUBSTRING(input, position, 1) = repeat_character THEN
            IF in_sequence THEN
                SET length = length + 1;
            ELSE
                SET length = 1;
            END IF;
            IF length > max_length THEN
                SET max_length = length;
            END IF;
            SET in_sequence = 1;
        ELSE
            SET in_sequence = 0;
        END IF;
        SET position = position + 1;
    END WHILE;
    RETURN max_length;
END//
DELIMITER ;

SELECT LONGEST_CHARACTER_SEQUENCE('1325******2h3n***3k2n*', '*');
-- Returns: 6

Inspired by Dylan Su's answer, this function increments a length counter until INSTR no longer returns true. I think it's simpler.

DELIMITER //
CREATE FUNCTION LONGEST_CHARACTER_SEQUENCE(input VARCHAR(255), repeat_character CHAR(1))
RETURNS TINYINT UNSIGNED DETERMINISTIC NO SQL
BEGIN
    DECLARE length TINYINT UNSIGNED DEFAULT 0;
    WHILE INSTR(input, REPEAT(repeat_character, length + 1)) DO
        SET length = length + 1;
    END WHILE;
    RETURN length;
END//
DELIMITER ;

SELECT LONGEST_CHARACTER_SEQUENCE('1325******2h3n***3k2n*', '*');
-- Also returns: 6

Upvotes: 0

KP.
KP.

Reputation: 393

what your looking for is basically the length of the longest substring,

you can find the algorithm for it here

Trying to achieve this with a query would not be such a good idea,

I suggest, using a stored procedure instead.

Upvotes: 0

Dylan Su
Dylan Su

Reputation: 6065

You can use instr and and generated table with UNION to get it.

-- This query can find up to 10. If more need, need to update the `UNION`.

select max((instr('1325*****2h3n***3k2n*',repeat('*', times)) != 0) * times ) longest_seq
from (select 1 times union select 2 union select 3 union select 4 union select 5 
    union select 6 union select 7 union select 8 union select 9 union select 10) t;

Demo:

mysql> select max((instr('1325*****2h3n***3k2n*',repeat('*', times)) != 0) * times ) longest_seq
    -> from (select 1 times union select 2 union select 3 union select 4 union select 5
    -> union select 6 union select 7 union select 8 union select 9 union select 10) t;
+-------------+
| longest_seq |
+-------------+
|           5 |
+-------------+
1 row in set (0.01 sec)

Upvotes: 1

Related Questions