Reputation: 542
I wasn't sure quite what to call this problem but it's not exactly counting rows. Let's say we have the relation:
Competition(compId, sport, playerName, medal)
And let's say the attribute medal can be either gold, silver, bronze, or null. So we have the following data:
(193, Tennis, John Doe, Gold)
(931, Skiing, Mary White, Bronze)
(193, Tennis, Arnold Black, null)
(182, Bobsledding, John Doe, Gold)
(901, Ping-Pong, Adam Brown, Silver)
(248, Bobsledding, Mary White, Silver)
I am having a very hard time figuring out how to answer this question: Get the names of all players who have won more than one medal. In this data the answers would be John Doe and Mary White. How could I get that answer on arbitrary data for this relation using relational algebra?
(This is a simplified version of the actual homework problem, and this simplification represents (I hope) the part of that problem I'm struggling with. There are an arbitrary and unknown number of competitions, sports, and players, but only 4 possibilities for medal)
Upvotes: 7
Views: 37735
Reputation: 15
I have a similar problem. I resolved thinking if I match every person with himself, and then filter only when the attribute is different, I get the people with more that one value for that attribute, in this case, medal.
Player1 = π compId,playerName,medal (ρ Player1 (Competition))
Player2 = π compId,playerName,medal (ρ Player2 (Competition))
π Player1.playerName (σ Player1.medal<>Player.medal (Player1 ⨝ Player1.compId=Player2.compId Player2))
Upvotes: 0
Reputation: 614
I know this is a very old post, but I'm learning this subject, and I did find another way to represent what the exercise asks. Maybe it can be usefull for someone searching for this.
The answer:
π Lc.playerName (
ρ Lc σ medal ≠ 'null' Competition
⨝ Lc.playerName = Rc.playerName AND Lc.sport ≠ Rc.sport
ρ Rc σ medal ≠ 'null' Competition)
I'm assuming that nobody can have two medals in the same sport, if somone have more than one medal, it has to be in different sports. I'm also assuming that each tuple in the relation represents an awarding, so ...
The expression above manages to get the playerName that appears more than once for different sports.
Edited: I treated null... you can see it in action with this tool https://dbis-uibk.github.io/relax/calc.htm
to set the data, use this this code in the Group Editor tab
group:Competition
Competition = {
compId:number, sport:string , playerName:string, medal:string
193 , 'Tennis' , 'John Doe' , 'Gold'
931 , 'Skiing' , 'Mary White' , 'Bronze'
193 , 'Tennis' , 'Arnold Black' , 'null'
182 , 'Bobsledding', 'John Doe' , 'Gold'
901 , 'Ping-Pong' , 'Adam Brown' , 'Silver'
248 , 'Bobsledding', 'Mary White' , 'Silver'
}
Upvotes: 0
Reputation: 15118
Get the names of all players who have won more than one medal.
(It's not clear what this means. Have won than one kind of medal? Or have received more than one medal? Your example answer suggests the latter. Also, it treats "null" as just another kind of medal, not specially as in SQL.)
-- rows where
THERE EXISTS compId,sport,medal,compId1,compId2,medal2 SUCH THAT
in competition [compId] of sport [sport] player [playerName] won [medal]
AND in competition [compId2] of sport [sport2] player [playerName] won [medal2]
AND (compId <> compId2 OR sport <> sport2 OR medal <> medal2)
Using statement shorthand:
-- rows where
THERE EXISTS compId,sport,medal,compId1,compId2,medal2 SUCH THAT
Competition(compId, sport, playerName, medal)
AND Competition(compId2, sport2, playerName, medal2)
AND (compId <> compId2 OR sport <> sport2 OR medal <> medal2)
Rearranging (anticipating the limitations of one comparison per σ and one attribute set per ∪):
-- rows where
THERE EXISTS compId,sport,medal,compId1,compId2,medal2 SUCH THAT
( Competition(compId, sport, playerName, medal)
AND Competition(compId2, sport2, playerName, medal2)
AND compId <> compId2)
OR ( Competition(compId, sport, playerName, medal)
AND Competition(compId2, sport2, playerName, medal2)
AND sport <> sport2)
OR ( Competition(compId, sport, playerName, medal)
AND Competition(compId2, sport2, playerName, medal2)
AND medal <> medal2)
Now to get the algebra replace:
every column/attribute renaming by ρ (rename).
π playerName (
σ compId <> compId2 (Competition
⋈ ρ compID2/compID ρ sport2/sport ρ medal2/medal Competition)
∪ σ sport <> sport2 (Competition
⋈ ρ compID2/compID ρ sport2/sport ρ medal2/medal Competition)
∪ σ medal <> medal2 (Competition
⋈ ρ compID2/compID ρ sport2/sport ρ medal2/medal Competition)
)
(For more see this answer.)
Upvotes: 2
Reputation: 29
There is a much simpler way to solve this problem, in my opinion:
Essentially, you find the relation of records where medal isn't null, and then join this record with itself, joining on name. The resulting records will be the ones where there are duplicate names.
Upvotes: 0