Reputation: 1
My original database looks like that:
TYPE CONTRACT_ID
a 101011
c 101012
b 101011
b 101012
a 101011-1
c 101012
I am trying to get data, grouped by TYPE, counting unique CONTRACT_ID, but some contracts have subcontracts, like 101011 has subcontract 101011-1. All of them have to be counted as one contract.
I have tried distinct and it works but only partially because those subcontracts still being counted as unique entrees.
SELECT TYPE, count(distinct CONTRACT_ID) as countocc
FROM db_address
group by TYPE
I expect output like that:
TYPE countocc
a 1
b 2
c 1
Upvotes: 0
Views: 983
Reputation: 65363
The logic might be extracting the part of the string upto the dash character, if exists, and then grouping by type
column. But method differs depending on the DBMS
.
If you're using Oracle
, consider :
select type,
count( distinct
case when instr(contract_id,'-') > 0 then
substr(contract_id,1,instr(contract_id,'-')-1)
else
contract_id
end) as countocc
from db_address d
group by type
If SQL Server
, then consider :
select type,
count( distinct
case when charindex('-',contract_id) > 0 then
left(contract_id,charindex('-',contract_id)-1)
else
contract_id
end) as countocc
from db_address d
group by type;
If MySQL
, then consider :
select type,
count(distinct substring(contract_id,1,instr(contract_id,'-')-1)) as countocc
from db_address d
group by type;
If PostGRES
, then consider :
select type,
count( distinct
case when strpos(contract_id,'-') > 0 then
substr(contract_id,1,strpos(contract_id,'-')-1)
else
contract_id
end) as countocc
from db_address d
group by type;
Upvotes: 0
Reputation: 164139
Use a CASE
statement to count only the part of contract_id
before the '-'
(if it exists):
select
type,
count(distinct
case
when contract_id like '%-%' then
substring(contract_id, 1, instr(contract_id, '-') - 1)
else contract_id
end
) counter
from db_address
group by type
This covers the case (if there is such a case) that a subcontract is in the table but not the main contract.
The code works for MySql but all the functions used can be found in any rdbms.
See the demo.
Results:
| type | counter |
| ---- | ------- |
| a | 1 |
| b | 2 |
| c | 1 |
Upvotes: 0
Reputation: 1270463
How about ignoring the subcontracts all-together? You seem to have the parent contract when you have the subs:
SELECT TYPE, count(distinct CONTRACT_ID) as countocc
FROM db_address
WHERE CONTRACT_ID NOT LIKE '%-%'
GROUP BY TYPE;
Upvotes: 1