Performance of nested select

mixkat picture mixkat · Jul 4, 2013 · Viewed 15.5k times · Source

I know this is a common question and I have read several other posts and papers but I could not find one that takes into account indexed fields and the volume of records that both queries could return.

My question is simple really. Which of the two is recommended here written in an SQL-like syntax (in terms of performance).

First query:

Select *
from someTable s
where s.someTable_id in
                    (Select someTable_id 
                     from otherTable o
                     where o.indexedField = 123)

Second query:

Select *
from someTable
where someTable_id in
                  (Select someTable_id 
                   from otherTable o
                   where o.someIndexedField = s.someIndexedField
                   and o.anotherIndexedField = 123)

My understanding is that the second query will query the database for every tuple that the outer query will return where the first query will evaluate the inner select first and then apply the filter to the outer query.

Now the second query may query the database superfast considering that the someIndexedField field is indexed but say that we have thousands or millions of records wouldn't it be faster to use the first query?

Note: In an Oracle database.

Answer

vigoncas picture vigoncas · May 11, 2014

In MySQL, if nested selects are over the same table, the execution time of the query can be hell.

A good way to improve the performance in MySQL is create a temporary table for the nested select and apply the main select against this table.

For example:

Select *
from someTable s1
where s1.someTable_id in
                    (Select someTable_id 
                     from someTable s2
                     where s2.Field = 123);

Can have a better performance with:

create temporary table 'temp_table' as (
  Select someTable_id 
  from someTable s2
  where s2.Field = 123
);

Select *
from someTable s1
where s1.someTable_id in
                    (Select someTable_id 
                     from tempTable s2);

I'm not sure about performance for a large amount of data.