timurkins
timurkins

Reputation: 1

SQL Count first occurrence

My original database looks like that:

TYPE CONTRACT_ID
a    101011 
c    101012
b    101011
b    101012
a    101011-1
c    101012

I am trying to get data, grouped by TYPE, counting unique CONTRACT_ID, but some contracts have subcontracts, like 101011 has subcontract 101011-1. All of them have to be counted as one contract.

I have tried distinct and it works but only partially because those subcontracts still being counted as unique entrees.

SELECT TYPE, count(distinct CONTRACT_ID) as countocc
FROM db_address
group by TYPE

I expect output like that:

TYPE  countocc
a     1 
b     2
c     1

Upvotes: 0

Views: 983

Answers (3)

Barbaros Özhan
Barbaros Özhan

Reputation: 65363

The logic might be extracting the part of the string upto the dash character, if exists, and then grouping by type column. But method differs depending on the DBMS.

If you're using Oracle, consider :

select type, 
       count( distinct
             case when instr(contract_id,'-') > 0 then
                  substr(contract_id,1,instr(contract_id,'-')-1)
             else
                  contract_id
             end) as countocc
  from db_address d
 group by type 

If SQL Server, then consider :

select type, 
       count( distinct
             case when charindex('-',contract_id) > 0 then
                  left(contract_id,charindex('-',contract_id)-1)
             else
                  contract_id
             end) as countocc
  from db_address d
 group by type;

If MySQL, then consider :

select type, 
       count(distinct substring(contract_id,1,instr(contract_id,'-')-1)) as countocc
  from db_address d
 group by type;

If PostGRES , then consider :

select type, 
       count( distinct
             case when strpos(contract_id,'-') > 0 then
                  substr(contract_id,1,strpos(contract_id,'-')-1)
             else
                  contract_id
             end) as countocc
  from db_address d
 group by type;

Upvotes: 0

forpas
forpas

Reputation: 164139

Use a CASE statement to count only the part of contract_id before the '-' (if it exists):

select 
  type,
  count(distinct 
    case 
      when contract_id like '%-%' then 
        substring(contract_id, 1, instr(contract_id, '-') - 1)
      else contract_id
    end
  ) counter
from db_address
group by type

This covers the case (if there is such a case) that a subcontract is in the table but not the main contract.
The code works for MySql but all the functions used can be found in any rdbms.
See the demo.
Results:

| type | counter |
| ---- | ------- |
| a    | 1       |
| b    | 2       |
| c    | 1       |

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270463

How about ignoring the subcontracts all-together? You seem to have the parent contract when you have the subs:

SELECT TYPE, count(distinct CONTRACT_ID) as countocc
FROM db_address
WHERE CONTRACT_ID NOT LIKE '%-%'
GROUP BY TYPE;

Upvotes: 1

Related Questions