I wasn't sure quite what to call this problem but it's not exactly counting rows. Let's say we have the relation:
Competition(compId, sport, playerName, medal)
And let's say the attribute medal can be either gold, silver, bronze, or null. So we have the following data:
(193, Tennis, John Doe, Gold)
(931, Skiing, Mary White, Bronze)
(193, Tennis, Arnold Black, null)
(182, Bobsledding, John Doe, Gold)
(901, Ping-Pong, Adam Brown, Silver)
(248, Bobsledding, Mary White, Silver)
I am having a very hard time figuring out how to answer this question: Get the names of all players who have won more than one medal. In this data the answers would be John Doe and Mary White. How could I get that answer on arbitrary data for this relation using relational algebra?
(This is a simplified version of the actual homework problem, and this simplification represents (I hope) the part of that problem I'm struggling with. There are an arbitrary and unknown number of competitions, sports, and players, but only 4 possibilities for medal)
Get the names of all players who have won more than one medal.
(It's not clear what this means. Have won than one kind of medal? Or have received more than one medal? Your example answer suggests the latter. Also, it treats "null" as just another kind of medal, not specially as in SQL.)
-- rows where
THERE EXISTS compId,sport,medal,compId1,compId2,medal2 SUCH THAT
in competition [compId] of sport [sport] player [playerName] won [medal]
AND in competition [compId2] of sport [sport2] player [playerName] won [medal2]
AND (compId <> compId2 OR sport <> sport2 OR medal <> medal2)
Using statement shorthand:
-- rows where
THERE EXISTS compId,sport,medal,compId1,compId2,medal2 SUCH THAT
Competition(compId, sport, playerName, medal)
AND Competition(compId2, sport2, playerName, medal2)
AND (compId <> compId2 OR sport <> sport2 OR medal <> medal2)
Rearranging (anticipating the limitations of one comparison per σ and one attribute set per ∪):
-- rows where
THERE EXISTS compId,sport,medal,compId1,compId2,medal2 SUCH THAT
( Competition(compId, sport, playerName, medal)
AND Competition(compId2, sport2, playerName, medal2)
AND compId <> compId2)
OR ( Competition(compId, sport, playerName, medal)
AND Competition(compId2, sport2, playerName, medal2)
AND sport <> sport2)
OR ( Competition(compId, sport, playerName, medal)
AND Competition(compId2, sport2, playerName, medal2)
AND medal <> medal2)
Now to get the algebra replace:
every column/attribute renaming by ρ (rename).
π playerName (
σ compId <> compId2 (Competition
⋈ ρ compID2/compID ρ sport2/sport ρ medal2/medal Competition)
∪ σ sport <> sport2 (Competition
⋈ ρ compID2/compID ρ sport2/sport ρ medal2/medal Competition)
∪ σ medal <> medal2 (Competition
⋈ ρ compID2/compID ρ sport2/sport ρ medal2/medal Competition)
)
(For more see this answer.)