B-Ray
B-Ray

Reputation: 471

How To Determine If Number is Sequential In Oracle SQL

This is a second part to a previous question I'm asking. I have a table of license numbers and am trying to determine if they are legitimate. One of the tests that I'm trying to run is if the numbers are sequential/counting pattern. What would be really helpful is if I can get a max count of sequential characters, but I feel that is more difficult. For example:

 LICENSE_NUMBER   IS_COUNTING    NUM_SEQ_CHARS
 123456789        TRUE           9
 123455678        FALSE          5
 456789012        TRUE           9
 12345            TRUE           5
 123451234        FALSE          5

Thanks for your help!

Upvotes: 0

Views: 1145

Answers (3)

user5683823
user5683823

Reputation:

Here is an efficient way to determine the maximum count of consecutive increasing digits in a given license number. I didn't bother with the YES/NO column - that can be obtained trivially, for example by comparing the MAX_COUNT to length(license_number). Leaving that as an exercise.

We need to break up the license number into its individual characters, keeping track of position. Then we can subtract the "position" and take the residue modulo 10. Since Oracle believes that mod(-1, 10) is -1 and not 9 (they really need to go back to elementary school to re-learn the MODULO function), I need to add 10 so I get the correct result in all cases (explaining an "oddity" in the solution below). Now the problem becomes: Find the longest sequence of "equal" residues mod 10 for each license number.

Finding consecutive "something" in a sequence is done most efficiently with the so-called "Tabibitosan method" (the difference of two row_number()s in the CTE I called prep in my solution). Then it all becomes a matter of grouping, counting and taking a max().

with
-- begin of test data; not part of the solution
     test_data ( license_number ) as (
       select '123456789' from dual union all
       select '123455678' from dual union all
       select '456789012' from dual union all
       select '12345'     from dual union all
       select '123451234' from dual union all
       select '402023488' from dual union all
       select '4189012'   from dual
     ),
-- end of test data; solution (SQL query) continues below this line
    tokenized ( license_number, idx, res ) as (
      select license_number, level,
             mod(10 + to_number(substr(license_number, level, 1)) - level, 10)
      from   test_data
      connect by level <= length(license_number)
          and prior license_number = license_number
          and prior sys_guid() is not null
    ),
    prep ( license_number, res, grp ) as (
      select license_number, res, 
             row_number() over (partition by license_number order by idx) -
               row_number() over (partition by license_number, res order by idx)
      from   tokenized
    ),
    grouped ( license_number, res, grp, ct ) as (
      select   license_number, res, grp, count(*)
      from     prep
      group by license_number, res, grp
    )
select   license_number, max(ct) as max_count
from     grouped
group by license_number
;

Output:

LICENSE_NUMBER  MAX_COUNT
--------------  ---------
123455678               5
123456789               9
456789012               9
123451234               5
4189012                 5
12345                   5
402023488               3

Upvotes: 1

krokodilko
krokodilko

Reputation: 36127

This query also calculates a count of sequential characters:

SELECT LICENSE_NUMBER,
       CASE length( LICENSE_NUMBER ) WHEN  max( cnt) THEN 'TRUE' ELSE 'FALSE' END As IS_COUNTING ,
       max( cnt) As NUM_SEQ_CHARS
FROM (
      SELECT LICENSE_NUMBER, p, count(*) As cnt
      FROM (
            SELECT LICENSE_NUMBER,
                   SUM( xx ) OVER ( Partition By LICENSE_NUMBER ORDER BY x ) As p
            FROM (
                  SELECT LICENSE_NUMBER,
                         x, qq,
                         CASE WHEN qq - 1 = LAG( qq ) OVER ( Partition By LICENSE_NUMBER ORDER BY x ) 
                              THEN 0 
                              WHEN qq = 0 AND 9 = LAG( qq ) OVER ( Partition By LICENSE_NUMBER ORDER BY x ) 
                              THEN 0
                              ELSE 1 
                          END As xx
                  FROM (
                      SELECT LICENSE_NUMBER, x, substr( LICENSE_NUMBER, x, 1 ) qq
                      FROM (
                          SELECT *
                          FROM table
                          CROSS JOIN (
                             SELECT level x FROM dual
                             CONNECT BY LEVEL <= ( SELECT MAX( length( LICENSE_NUMBER )) FROM table )
                          )
                      )
                  )
            )
       )
       GROUP BY LICENSE_NUMBER, p
)
GROUP BY LICENSE_NUMBER

Upvotes: 0

Markus Jarderot
Markus Jarderot

Reputation: 89231

^(01(2345678901)*(2(3(4(5(6(7(8(90?)?)?)?)?)?)?)?)?
 |12(3456789012)*(3(4(5(6(7(8(9(01?)?)?)?)?)?)?)?)?
 |23(4567890123)*(4(5(6(7(8(9(0(12?)?)?)?)?)?)?)?)?
 |34(5678901234)*(5(6(7(8(9(0(1(23?)?)?)?)?)?)?)?)?
 |45(6789012345)*(6(7(8(9(0(1(2(34?)?)?)?)?)?)?)?)?
 |56(7890123456)*(7(8(9(0(1(2(3(45?)?)?)?)?)?)?)?)?
 |67(8901234567)*(8(9(0(1(2(3(4(56?)?)?)?)?)?)?)?)?
 |78(9012345678)*(9(0(1(2(3(4(5(67?)?)?)?)?)?)?)?)?
 |89(0123456789)*(0(1(2(3(4(5(6(78?)?)?)?)?)?)?)?)?
 |90(1234567890)*(1(2(3(4(5(6(7(89?)?)?)?)?)?)?)?)?)$

It will match any string of two or more digits, that consists entirely of sequential digits.

123456789     MATCH
123455678     NOT MATCH
456789012     MATCH
12345         MATCH

Demo: http://regexr.com/3enkb

Upvotes: 1

Related Questions