Reputation: 3849
Suppose we have the following data structure:
s_cui o_cui year
D000001 D000001 1975
D000001 D000001 1976
D000001 D002413 1976
D000001 D002413 1979
D000001 D002413 1987
D000001 D004298 1976
D000002 D000002 1985
D000003 D000900 1975
D000003 D000900 1990
D000003 D004134 1983
D000003 D004134 1986
I need to count duplicate pairs (in the 1st and 2nd column) and to each such pair assign the lowest value from the 3rd column. For this example the output should be:
s_cui o_cui freq year
D000001 D000001 2 1975
D000001 D002413 3 1976
D000001 D004298 1 1976
D000002 D000002 1 1985
D000003 D000900 2 1975
D000003 D004134 2 1983
My initial attempt is here (without year
field):
SELECT s_cui, o_cui, COUNT(*) FROM table GROUP BY s_cui, o_cui;
Upvotes: 0
Views: 81
Reputation: 35353
You're close! You just need to add min(year).
It looks like you want the earliest occurrence of a year for each s_cui, o_cui and count.
SELECT s_cui, o_cui, COUNT(*) freq, min(`year`) as `year`
FROM table
GROUP BY s_cui, o_cui;
Min(year) will return the earliest year for each pairing of s_cui, and o_cui while still retaining the counts. I put ` around the year as it is a reserved word in mySQL.
though I don't see how "pairing" takes effect more of each "group"
Upvotes: 1