Why is this mySQL query extremely slow?

Majiy picture Majiy · Jul 11, 2013 · Viewed 9.7k times · Source

Given is a mySQL table named "orders_products" with the following relevant fields:

  • products_id
  • orders_id

Both fields are indexed.

I am running the following query:

SELECT products_id, count( products_id ) AS counter
FROM orders_products
WHERE orders_id
IN (
  SELECT DISTINCT orders_id
  FROM orders_products
  WHERE products_id = 85094
)
AND products_id != 85094
GROUP BY products_id
ORDER BY counter DESC
LIMIT 4

This query takes extremely long, around 20 seconds. The database is not very busy otherwise, and performs well on other queries.

I am wondering, what causes the query to be so slow?

The table is rather big (around 1,5 million rows, size around 210 mb), could this be a memory issue?

Is there a way to tell exactly what is taking mySQL so long?

Output of Explain:

id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
1   PRIMARY     orders_products     range   products_id     products_id     4   NULL    1577863     Using where; Using temporary; Using filesort
2   DEPENDENT SUBQUERY  orders_products     ref     orders_id,products_id   products_id     4   const   2   Using where; Using temporary

Answer

Bohemian picture Bohemian · Jul 11, 2013

Queries that use WHERE ID IN (subquery) perform notoriously badly with mysql.

With most cases of such queries however, it is possible to rewrite them as a JOIN, and this one is no exception:

SELECT
    t2.products_id,
    count(t2.products_id) AS counter
FROM orders_products t1
JOIN orders_products t2
    ON t2.orders_id = t1.orders_id
    AND t2.products_id != 85094 
WHERE t1.products_id = 85094
GROUP BY t2.products_id
ORDER BY counter DESC
LIMIT 4

If you want to return rows where there are no other products (and show a zero count for them), change the join to a LEFT JOIN.

Note how the first instance of the table has the WHERE products_id = X, which allows index look up and immediately reduces the number of rows, and the second instance of the table has the target data, but it looked up on the id field (again fast), but filtered in the join condition to count the other products.