Oracle USING clause best practice

RAY picture RAY · Apr 23, 2013 · Viewed 26k times · Source

Disclaimer: I'm a developer and not a DBA.

I've been a huge fan of the USING clause in Oracle since I accidentally stumbled upon it and have used it in place of the old-fashioned ON clause to join fact tables with dimension tables ever since. To me, it creates a much more succinct SQL and produces a more concise result set with no unnecessary duplicated columns.

However, I was asked yesterday by a colleague to convert all my USING clauses into ONs. I will check with him and ask him what his reasons are. He works much more closely with the database than I do, so I assume he has some good reasons.

I have not heard back from him (we work in different timezones), but I wonder if there are any guidelines or best practices regarding the use of the "using" clause? I've googled around quite a bit, but have not come across anything definitive. In fact, I've not even even a good debate anywhere.

Can someone shed some light on this? Or provide a link to a good discussion on the topic?

Thank you!

Answer

Alex Poole picture Alex Poole · Apr 23, 2013

You're presumably already aware of the distinction, but from the documentation:

ON condition Use the ON clause to specify a join condition. Doing so lets you specify join conditions separate from any search or filter conditions in the WHERE clause.

USING (column) When you are specifying an equijoin of columns that have the same name in both tables, the USING column clause indicates the columns to be used. You can use this clause only if the join columns in both tables have the same name. Within this clause, do not qualify the column name with a table name or table alias.

So these would be equivalent:

select e.ename, d.dname
from emp e join dept d using (deptno);

select e.ename, d.dname
from emp e join dept d on d.deptno = e.deptno;

To a large extent which you use is a matter of style, but there are (at least) two situations where you can't use using: (a) when the column names are not the same in the two tables, and (b) when you want to use the joining column:

select e.ename, d.dname, d.deptno
from emp e join dept d using(deptno);

select e.ename, d.dname, d.deptno
                         *
ERROR at line 1:
ORA-25154: column part of USING clause cannot have qualifier

You can of course just leave off the qualifier and select ..., deptno, as long as you don't have another table with the same column that isn't joined using it:

select e.ename, d.dname, deptno
from emp e join dept d using (deptno) join mytab m using (empno);

select e.ename, d.dname, deptno
                         *
ERROR at line 1:
ORA-00918: column ambiguously defined

In that case you can only select the qualified m.deptno. (OK, this is rather contrived...).

The main reason I can see for avoiding using is just consistency; since you sometimes can't use it, occasionally switching to on for those situations might be a bit jarring. But again that's more about style than any deep technical reason.

Perhaps your colleague is simply imposing (or suggesting) coding standards, but only they will know that. It also isn't quite clear if you're being asked to change some new code you've written that is going through review, or old code. If it's the latter then regardless of the reasons for them preferring on, I think you'd need to get a separate justification for modifying proven code, as there's a risk of introducing new problems even when the modified code is retested - quite apart from the cost/effort involved in the rework and retesting.

A couple of things strike me about your question though. Firstly you describes the on syntax as 'old-fashioned', but I don't think that's fair - both are valid and current (as of SQL:2011 I think, but citation needed!). And this:

produces a more concise result set with no unnecessary duplicated columns.

... which I think suggests you're using select *, otherwise you would just select one of the values, albeit with a couple of extra characters for the qualifier. Using select * is generally considered bad practice (here for example) for anything other than ad hoc queries and some subqueries.