Given is a mySQL table named "orders_products" with the following relevant fields:
Both fields are indexed.
I am running the following query:
SELECT products_id, count( products_id ) AS counter
FROM orders_products
WHERE orders_id
IN (
SELECT DISTINCT orders_id
FROM orders_products
WHERE products_id = 85094
)
AND products_id != 85094
GROUP BY products_id
ORDER BY counter DESC
LIMIT 4
This query takes extremely long, around 20 seconds. The database is not very busy otherwise, and performs well on other queries.
I am wondering, what causes the query to be so slow?
The table is rather big (around 1,5 million rows, size around 210 mb), could this be a memory issue?
Is there a way to tell exactly what is taking mySQL so long?
Output of Explain:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY orders_products range products_id products_id 4 NULL 1577863 Using where; Using temporary; Using filesort
2 DEPENDENT SUBQUERY orders_products ref orders_id,products_id products_id 4 const 2 Using where; Using temporary
Queries that use WHERE ID IN (subquery)
perform notoriously badly with mysql.
With most cases of such queries however, it is possible to rewrite them as a JOIN
, and this one is no exception:
SELECT
t2.products_id,
count(t2.products_id) AS counter
FROM orders_products t1
JOIN orders_products t2
ON t2.orders_id = t1.orders_id
AND t2.products_id != 85094
WHERE t1.products_id = 85094
GROUP BY t2.products_id
ORDER BY counter DESC
LIMIT 4
If you want to return rows where there are no other products (and show a zero count for them), change the join to a LEFT JOIN
.
Note how the first instance of the table has the WHERE products_id = X
, which allows index look up and immediately reduces the number of rows, and the second instance of the table has the target data, but it looked up on the id field (again fast), but filtered in the join condition to count the other products.