Relational Algebra equivalent of SQL "NOT IN"

jsj picture jsj · Sep 22, 2012 · Viewed 26.7k times · Source

Is there a relational algebra equivalent of the SQL expression NOT IN?

For example if I have the relation:

A1  |  A2
----------
x   |  y
a   |  b
y   |  x

I want to remove all tuples in the relation for which A1 is in A2. In SQL I might query:

SELECT
    *
FROM
    R
WHERE
    R.A1 NOT IN
        (
        SELECT
            A2
        FROM
            R
        )
/

What is really stumping me is how to subquery inside the relational algebra selection operator, is this possible?:

σsome subquery hereR

Answer

Andomar picture Andomar · Sep 22, 2012

In relational algebra, you can do this using a carthesian product. Something like:

R - ρa1,a2a11,a21A11 = A22a11,a21(R) x ρa12, a22(R))))

  • rename the columns of R, f.e. from a1 to a11 (left hand) and a12 (right hand)
  • take the cross product of the R's with renamed columns
  • select rows where a11 equals a22
  • project out a12 and a22 and keep a11 and a21
  • rename to a1 and a2

That gives you the rows that were matched. Subtract this from R to find the rows that where not matched.