Reputation: 583
I have a table as below:
paper_id author_id author_name author_affiliation
1 521630 Ayman Kaheel Cairo Microsoft Innovation Lab
1 972575 Mahmoud Refaat Cairo Microsoft Innovation Lab
3 1528710 Ahmed Abdul-hamid Harvard
Now I am finding multiple pairs of author_id
, author_name
and author_affiliation
occuring together. For example:
author_id author_name author_affiliation count
1 Masuo Fukui <NA> 4
4 Yasusada Yamada <NA> 8
I am using the following query:
statement<-"select author_id,author_name,author_affiliation,count(*)
from paper_author
GROUP BY author_id,author_name,author_affiliation
HAVING (COUNT(*)>1)"
Now I want to know how many author_ids are present in this. I am doing this:
statement<-"select distinct author_id
from paper_author
where author_id in (
select author_id,author_name,author_affiliation,count(*)
from paper_author
GROUP BY author_id,author_name,author_affiliation
HAVING (COUNT(*)>1)
)"
I am not able to get the desired result.
Also, how can I get the number of paper ids in the above result?
Thanks.
Upvotes: 0
Views: 1242
Reputation: 94914
Here is your query slightly re-written. You don't need an IN clause. You can select directly from your result set.
select distinct author_id
from
(
select author_id
from paper_author
group by author_id,author_name,author_affiliation
having count(*) > 1
);
Upvotes: 0
Reputation: 32224
If you just want to know how many authors have more than one paper, use this query:
SELECT COUNT(*)
FROM (SELECT author_id, author_affiliation, COUNT(*)
FROM paper_author
GROUP BY author_id, author_affiliation
HAVING COUNT(*) > 1);
This assumes that author_id
is a unique identifier for author_name
. If the id selects for the author_name, author_affiliation
combination (i.e. an author producing papers for different institutions has multiple id's, one for each affiliation) then you can also strike author_affiliation
from the sub-query.
Upvotes: 0
Reputation: 1797
This will do the job, i think:
statement<-"select distinct author_id
from paper_author
where author_id in (
select author_id
from paper_author
GROUP BY author_id,author_name,author_affiliation
HAVING (COUNT(*)>1)
)"
Upvotes: 1