ANisus
ANisus

Reputation: 78095

Aggregate bitfield values with binary OR

I have a table with int values being used as bitfields (where each bit is a flag).

Now I would like to aggregate them with a binary operation (in my case OR) so that:

SELECT 1 AS bitfield
INTO #TABLE
UNION ALL SELECT 1 + 2 + 8 + 32
UNION ALL SELECT 2 + 128
UNION ALL SELECT 2 + 32

SELECT AND_AGGR(bitfield) -- Invalid because AND_AGGR doesn't exist
FROM #TABLE

DROP #TABLE

would result in the value 171

What would be a good way to do this that hopefully doesn't require a lot of | and MAX (but if you must, you must)?

I am using MS SQL Server 2008 myself, but solutions on other servers are also of interest.

Upvotes: 1

Views: 1892

Answers (3)

Kalju Pärn
Kalju Pärn

Reputation: 31

In MS SQL Server

DECLARE @agg  VARCHAR(MAX) = '0001,0011,0101,0101,0101'
SELECT CONVERT(binary(4), VALUE, 2) , VALUE  FROM STRING_SPLIT( @agg , ',')

DECLARE @sum AS BIGINT = 0
DECLARE @mul AS BIGINT = 0xffffffff
SELECT  @sum |= v
   , @mul &= v
FROM STRING_SPLIT( @agg , ',')
CROSS APPLY (VALUES (CONVERT(binary(4), VALUE, 2))) _(v)

PRINT FORMAT(@sum,'X8')
PRINT FORMAT(@mul,'X8')

Prints

            VALUE
---------- ------------
0x00010000 0001
0x00110000 0011
0x01010000 0101
0x01010000 0101
0x01010000 0101

01110000
00010000

In more complex word you need:

CREATE OR ALTER FUNCTION dbo.BOR( @agg VARCHAR(MAX))
RETURNS BIGINT
AS 
BEGIN
DECLARE @sum AS BIGINT = 0
SELECT  @sum |= CONVERT(BIGINT, VALUE)
FROM STRING_SPLIT( @agg , ',')
RETURN @sum
END

GO
CREATE OR ALTER FUNCTION dbo.BAND( @agg VARCHAR(MAX))
RETURNS BIGINT
AS 
BEGIN
DECLARE @mul AS BIGINT = 0xffffffffffffffff
SELECT  @mul &= CONVERT(BIGINT, VALUE)
FROM STRING_SPLIT( @agg , ',')
RETURN @mul
END
GO

when using bitmap of payment periods

;WITH delayedPayment AS
(SELECT * FROM ( VALUES 
    ( 123, 67, '2020-2-1')
   ,( 123, 67, '2020-4-1')
   ,( 123, 67, '2020-5-1')
   ,( 123, 67, '2020-6-1')
   ,( 123, 68, '2020-6-1')  -- another agreement
   ,( 123, 67, '2020-12-1')
           
   ,( 456, 69, '2020-4-1')
   ,( 456, 69, '2020-8-1')
   ,( 456, 69, '2020-10-1')
   ,( 456, 69, '2020-11-1')) _(cuno, loan, missedDuedate)
)
, delayPattern AS
(SELECT cuno
   ,  sum_months
   ,  bor_months
   ,  IIF( FORMAT( CAST(bor_months AS BIGINT), 'X16') LIKE '%111%', 'dalyad 3+ month in row', NULL) delayState
   FROM (SELECT cuno
         , SUM(POWER( 16.0, CONVERT( BIGINT, DATEDIFF( month, missedDuedate, '2020-12-1')))) sum_months
         , dbo.BOR( STRING_AGG( CONVERT( BIGINT, POWER( 16.0, DATEDIFF( month, missedDuedate, '2020-12-1'))),',')) bor_months
      FROM delayedPayment
      GROUP BY cuno
   ) s
)
SELECT cuno
   ,  FORMAT( CAST(sum_months AS BIGINT), 'X16') sum_months
   ,  FORMAT( CAST(bor_months AS BIGINT), 'X16') bor_months
   ,  delayState
FROM delayPattern

cuno    sum_months          bor_months          delayState
123     00000*10112*000001  00000*10111*000001  dalyad 3+ month in row
456     0000000100010110    0000000100010110    NULL

But sometimes just one need to think and you can do it with SUM

, delayPattern AS -- optimal
(SELECT cuno
   ,  bor_months
   ,  IIF( FORMAT( CAST(bor_months AS BIGINT), 'X16') LIKE '%111%', 'dalyad 3+ month in row', NULL) delayState
   FROM (SELECT cuno
         , SUM(POWER( 16.0, missedmonth)) bor_months
      FROM ( SELECT DISTINCT cuno
               , missedmonth
            FROM delayedPayment
            CROSS APPLY (VALUES ( DATEDIFF( month, missedDuedate, '2020-12-1'))) _(missedmonth)
            GROUP BY cuno, missedmonth
            ) ss
      GROUP BY cuno
   ) s
)

SELECT cuno
   ,  FORMAT( CAST(bor_months AS BIGINT), 'X16') bor_months
   ,  delayState
FROM delayPattern

Will print

cuno    bor_months  delayState
123 0000010111000001    dalyad 3+ month in row
456 0000000100010110    NULL

NOTE: I am using HEX format and POWER(16.0, X) , just to be lazy, POWER(2.0, X) will be correct, but then you need bin->string formatter. Something like this:

CREATE OR ALTER FUNCTION dbo.toBinaryString(@p INT)
RETURNS VARCHAR(24)
AS 
BEGIN 
RETURN  REVERSE(REPLACE( REPLACE( 
    REPLACE( REPLACE( REPLACE( REPLACE( 
    REPLACE( REPLACE( REPLACE( REPLACE( 
    REPLACE( REPLACE( REPLACE( REPLACE( 
    REPLACE( REPLACE( REPLACE( REPLACE( FORMAT(@p,'X8'), 
        '0', '....'), '1', '...x'),'2', '..x.'),'3', '..xx'),
        '4', '.x..'), '5', '.x.x'),'6', '.xx.'),'7', '.xxx'),
        '8', 'x...'), '9', 'x..x'),'A', 'x.x.'),'B', 'x.xx'),
        'C', 'xx..'), 'D', 'xx.x'),'E', 'xxx.'),'F', 'xxxx'),
        '.','0'),'x','1'))
END

Upvotes: 0

Ed Harper
Ed Harper

Reputation: 21505

If you're expecting the result 171, surely you mean binary OR not AND?

In any case, this solution aggregates the values into a variable:

SELECT 1 AS bitfield
INTO #TABLE
UNION ALL SELECT 1 + 2 + 8 + 32
UNION ALL SELECT 2 + 128
UNION ALL SELECT 2 + 32

DECLARE @i int = 0

SELECT @i = @i | bitfield
FROM #TABLE

SELECT @i

DROP TABLE  #table

This might not meet your requirements if you want to group the aggregation by another field.

It is also unlikely to perform well on a large table.

Upvotes: 1

Mark Byers
Mark Byers

Reputation: 839254

On MySQL and PostgreSQL you can use BIT_OR.

I don't think SQL Server has this aggregate function.

You could do it with lots of MAX and & as you said:

MAX(x & 1) + MAX(x & 2) + ... + MAX(x & 128)

Upvotes: 3

Related Questions