Mariska
Mariska

Reputation: 1953

SQL sum values from the same table

I have a table Hobby, whose snippet is as follows:

Name    Activity    Hours

John    Hiking      .5
Sam     Cycling     .5
Sam     Swimming    1
Sam     Hiking      .5
John    Running     1
Sam     Sailing     1

For every person X in (X, Y), I would like to find the sum of hours of activities where X and Y don't have in common. For example, if John = X and Sam = Y, then it would yield 1, since Running is the only activity John has that Sam doesn't.

My code is as follows:

select a.Name, b.Name, sum(a.Hours)
from Hobby a, Hobby b
where a.Name <> b.Name and a.Activity <> b.Activity
group by a.Name, b.Name;

However, this gave me a wrong answer. What is wrong with my code?

Upvotes: 2

Views: 3973

Answers (4)

Laurence
Laurence

Reputation: 10976

Think of the underlying results before the group by. Consider only rows where table a is John, Hiking. In your query you would have

John    Hiking    0.5    John    Hiking      0.5
John    Hiking    0.5    Sam     Cycling     0.5
John    Hiking    0.5    Sam     Swimming    1
John    Hiking    0.5    Sam     Hiking      0.5
John    Hiking    0.5    John    Running     1
John    Hiking    0.5    Sam     Sailing     1

With your where clause, you'd remove the John rows and the Hiking rows from table b leaving:

John    Hiking    0.5    Sam     Cycling     0.5
John    Hiking    0.5    Sam     Swimming    1
John    Hiking    0.5    Sam     Sailing     1

So you're going to be counting these hours three times in the John, Sam set.

Here's a way to adjust it, while still only using each table once:

Select
    a.Name,
    b.Name Name2,
    Sum(a.Hours) / count(distinct b.activity) 
       - Sum(case when a.Activity = b.Activity then a.Hours else 0 end) as Hours
From
    Hobby a,
    Hobby b
Where
    a.Name != b.Name
Group By
    a.Name,
    b.Name

Example Fiddle

Upvotes: 0

David Aldridge
David Aldridge

Reputation: 52336

If you take the list of people and activities, and join to it a list of all other people, and test whether the other people also take part in that activity, it ought to do the trick.

with
  cte_unique_names as (
    select distinct name
    from   hobby)
select
  h.name participates,
  n.name does_not_participate,
  sum(hours) hours
from
  hobby h
cross join
  cte_unique_names n
where
  n.name != h.name and
  not exists (
    select null
    from   hobby h2
    where  h2.name     = n.name and
           h2.activity = h.activity)
group by
  h.name,
  n.name

Upvotes: 0

John Bingham
John Bingham

Reputation: 2006

your from clause reads

FROM Hobby a, Hobby b

Putting a comma in the from clause means "CROSS JOIN" which means each row in the first table are correlated with every row in the second table. Given your where clause I would think this gives some pretty big numbers.

your query needs to be a bit different:

select sum(hours) 
from hobby 
where name = 'John' 
  and activity not in (
    select activity 
    from hobby 
    where name = 'Sam'
  )

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269553

I find this to be a tricky question. My original approach was going to use a full outer join. But then I realized that if there is no match on the activity in one name, then I'm not going to have the name either.

So, the following query works by getting a list of all pairs of names. This is an ordered list, so a given pair of names only appears once. Then this is joined to the Hobby table twice, using left outer join to get the matches. The key, though, is that when there is no match, the row with Activity on it is still present, but with a NULL value.

The where clause finds all Activitys that have a NULL in either table. These are the ones that don't match. Then it is a simple matter of just adding up the hours:

select names.Name1, names.Name2, sum(coalesce(h1.hours, h2.hours))
from (select distinct h1.Name as name1, h2.Name as name2
      from Hobby h1 cross join Hobby h2
      where h1.Name < h2.Name
     ) names left outer join
     Hobby h1
     on names.name1 = h1.name left outer join
     Hobby h2
     on names.name2 = h2.name and
        h1.Activity = h2.Activity
where h1.Activity is null or h2.Activity is null
group by names.Name1, names.Name2;

Upvotes: 4

Related Questions