Is there a generalized procedure or algorithm for transforming a SQL subquery into a join, or vice versa? That is, is there a set of typographic operations that can be applied to a syntactically correct SQL query statement containing a subquery that results in a functionally equivalent statement without a subquery? If so, what are they (i.e., what's the algorithm), and in what cases do they not apply?
Converting a subquery into a JOIN can be pretty straightforward:
IN
clause FROM TABLE_X x
WHERE x.col IN (SELECT y.col FROM TABLE_Y y)
...can be converted to:
FROM TABLE_X x
JOIN TABLE_Y y ON y.col = x.col
Your JOIN criteria is where you have direct comparison.
EXISTS
clauseBut there are complications when you look at the EXISTS
clause. EXISTS are typically correllated, where the subquery is filtered by criteria from the table(s) outside the subquery. But the EXISTS is only for returning a boolean based on the criteria.
FROM TABLE_X x
WHERE EXISTS (SELECT NULL
FROM TABLE_Y y
WHERE y.col = x.col)
...converted:
FROM TABLE_X x
JOIN TABLE_Y y ON y.col = x.col
Because of the boolean, there's a risk of more rows turning up in the resultset.
SELECT
s in the SELECT clauseThese should always be changed, with prejudice:
SELECT x.*,
(SELECT MAX(y.example_col)
FROM TABLE_Y y
WHERE y.col = x.col)
FROM TABLE_X x
You're probably noticing a patter now, but I made this a little different for an inline view example:
SELECT x.*,
z.mc
FROM TABLE_X x
JOIN (SELECT y.col, --inline view within the brackets
MAX(y.example_col) 'mc'
FROM TABLE_Y y
GROUP BY y.col) z ON z.col = x.col
The key is making sure the inline view resultset includes the column(s) needed to join to, along with the columns.
LEFT JOIN
sYou might've noticed I didn't have any LEFT JOIN examples - this would only be necessary if columns from the subquery use NULL testing (COALESCE
on almost any db these days, Oracle's NVL
or NVL2
, MySQLs IFNULL
, SQL Server's ISNULL
, etc...):
SELECT x.*,
COALESCE((SELECT MAX(y.example_col)
FROM TABLE_Y y
WHERE y.col = x.col), 0)
FROM TABLE_X x
Converted:
SELECT x.*,
COALESCE(z.mc, 0)
FROM TABLE_X x
LEFT JOIN (SELECT y.col,
MAX(y.example_col) 'mc'
FROM TABLE_Y y
GROUP BY y.col) z ON z.col = x.col
I'm not sure if that will satisfy your typographic needs, but hope I've demonstrated that the key is determining what the JOIN criteria is. Once you know the column(s) involved, you know the table(s) involved.