Manju Prabhu
Manju Prabhu

Reputation: 462

SQL Query Performance with count

I have 2 tables, COMPANY and EMPLOYEE.

COMPANY_ID is the primary key of the COMPANY table and foreign key for EMPLOYEE table. The COMPANY_ID is a 10 digit number. We are generate a 3 number combination and query the database.

The select statement has regex to bulk load the company based on COMPANY_ID. The query is executed multiple times with different patterns

i.e.

regexp_like(COMPANY_ID, '^(000|001|002|003|004|005|006|007|008|009)') .

Existing query looks something like this -

select * 
from COMPANY company
where regexp_like(company.COMPANY_ID, '^(000|001|002|003|004|005|006|007|008|009)')

The new requirement is to retrieve the company information along with the employee count. For example if a company has 10 employees, then the query should return all the columns of the COMPANY table, along with employee count i.e. 10

This is the select statement that I came up with -

select 
    nvl(count_table.cont_count, 0), company.* 
from 
    COMPANY company,
    (select company.COMPANY_ID, count(company.COMPANY_ID) as cont_count 
     from COMPANY company, EMPLOYEE employee
     where regexp_like(company.COMPANY_ID, '^(000|001|002|003|004|005|006|007|008|009)') 
       and company.CONTACT_ID = employee.CONTACT_ID     
     group by (company.COMPANY_ID)) count_table
 where 
     regexp_like(company.COMPANY_ID, '^(000|001|002|003|004|005|006|007|008|009)') 
     and count_table.COMPANY_ID(+)= company.COMPANY_ID

Above query works, but it takes double the time compared to the previous statement. Is there a better way to retrieve the employee count?

Note: Oracle database is in use.

Upvotes: 0

Views: 94

Answers (2)

Patrick Bacon
Patrick Bacon

Reputation: 4660

The derived table does not add value, thus I would get rid of it and use a scalar query (because I do not know all of your columns in the company table to properly do a group by):

select c.*,
  nvl(
  (select count(1)
  from employee emp
  where emp.company_id = c.company_id
  ),0) employee_count
from company c
where regexp_like(c.company_id, '^(000|001|002|003|004|005|006|007|008|009)')

Also, if performance is still an issue, I would consider modifying your where statement to not use a regexp.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Addendum ~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I see that the question explicitly identifies that the employee table has company_id as a foreign key. Since this is clarified, I am removing this statement:

The data model for these tables is not intuitive (would you not have company_id as a foreign key in the employees table?).

Upvotes: 1

Tony Andrews
Tony Andrews

Reputation: 132750

You don't need to execute that expensive REGEXP_LIKE twice:

 select nvl(count_table.cont_count,0),company.* 
 from COMPANY company 
 ,( select   employee.COMPANY_ID, count(employee.COMPANY_ID) as cont_count
    from     EMPLOYEE employee
     group by (employee.COMPANY_ID)
  ) count_table
  where regexp_like(company.COMPANY_ID, '^(000|001|002|003|004|005|006|007|008|009)') 
  and count_table.COMPANY_ID(+)= company.COMPANY_ID

Or you could use a scalar subquery:

select company.* 
,      (select count(*)
        from   employee e
        where  e.company_id = c.company_id
       )
from COMPANY c 
where regexp_like(c.COMPANY_ID, '^(000|001|002|003|004|005|006|007|008|009)') 

And personally I would ditch the slow REGEXP_LIKE for something like:

where substr(c.company_id,1,3) between '000' and '009'

Upvotes: 2

Related Questions