Subqueries with EXISTS vs IN - MySQL

Techie picture Techie · Jan 7, 2013 · Viewed 43k times · Source

Below two queries are subqueries. Both are the same and both works fine for me. But the problem is Method 1 query takes about 10 secs to execute while Method 2 query takes under 1 sec.

I was able to convert method 1 query to method 2 but I don't understand what's happening in the query. I have been trying to figure it out myself. I would really like to learn what's the difference between below two queries and how does the performance gain happen ? what's the logic behind it ?

I'm new to these advance techniques. I hope someone will help me out here. Given that I read the docs which does not give me a clue.

Method 1 :

SELECT
   *       
FROM
   tracker       
WHERE
   reservation_id IN (
      SELECT
         reservation_id                                 
      FROM
         tracker                                 
      GROUP  BY
         reservation_id                                 
      HAVING
         (
            method = 1                                          
            AND type = 0                                          
            AND Count(*) > 1 
         )                                         
         OR (
            method = 1                                              
            AND type = 1                                              
            AND Count(*) > 1 
         )                                         
         OR (
            method = 2                                              
            AND type = 2                                              
            AND Count(*) > 0 
         )                                         
         OR (
            method = 3                                              
            AND type = 0                                              
            AND Count(*) > 0 
         )                                         
         OR (
            method = 3                                              
            AND type = 1                                              
            AND Count(*) > 1 
         )                                         
         OR (
            method = 3                                              
            AND type = 3                                              
            AND Count(*) > 0 
         )
   )

Method 2 :

SELECT
   *                                
FROM
   `tracker` t                                
WHERE
   EXISTS (
      SELECT
         reservation_id                                              
      FROM
         `tracker` t3                                              
      WHERE
         t3.reservation_id = t.reservation_id                                              
      GROUP BY
         reservation_id                                              
      HAVING
         (
            METHOD = 1 
            AND TYPE = 0 
            AND COUNT(*) > 1
         ) 
         OR                                                     
         (
            METHOD = 1 
            AND TYPE = 1 
            AND COUNT(*) > 1
         ) 
         OR                                                    
         (
            METHOD = 2 
            AND TYPE = 2 
            AND COUNT(*) > 0
         ) 
         OR                                                     
         (
            METHOD = 3 
            AND TYPE = 0 
            AND COUNT(*) > 0
         ) 
         OR                                                     
         (
            METHOD = 3 
            AND TYPE = 1 
            AND COUNT(*) > 1
         ) 
         OR                                                     
         (
            METHOD = 3 
            AND TYPE = 3 
            AND COUNT(*) > 0
         )                                             
   )

Answer

bonCodigo picture bonCodigo · Jan 7, 2013

An Explain Plan would have shown you why exactly you should use Exists. Usually the question comes Exists vs Count(*). Exists is faster. Why?

  • With regard to challenges present by NULL: when subquery returns Null, for IN the entire query becomes Null. So you need to handle that as well. But using Exist, it's merely a false. Much easier to cope. Simply IN can't compare anything with Null but Exists can.

  • e.g. Exists (Select * from yourtable where bla = 'blabla'); you get true/false the moment one hit is found/matched.

  • In this case IN sort of takes the position of the Count(*) to select ALL matching rows based on the WHERE because it's comparing all values.

But don't forget this either:

  • EXISTS executes at high speed against IN : when the subquery results is very large.
  • IN gets ahead of EXISTS : when the subquery results is very small.

Reference to for more details: