SQL Server pivot vs. multiple join

Guillermo Gutiérrez picture Guillermo Gutiérrez · Sep 16, 2011 · Viewed 24.5k times · Source

What is more efficient to use in SQL Server 2005: PIVOT or MULTIPLE JOIN?

For example, I got this query using two joins:

SELECT p.name, pc1.code as code1, pc2.code as code2
FROM product p
    INNER JOIN product_code pc1
    ON p.product_id=pc1.product_id AND pc1.type=1
    INNER JOIN product_code pc2
    ON p.product_id=pc2.product_id AND pc2.type=2

I can do the same using PIVOT:

SELECT name, [1] as code1, [2] as code2
FROM (
    SELECT p.name, pc.type, pc.code
    FROM product p
        INNER JOIN product_code pc
        ON p.product_id=pc.product_id
    WHERE pc.type IN (1,2)) prods1
PIVOT(
    MAX(code) FOR type IN ([1], [2])) prods2

Which one will be more efficient?

Answer

Martin Smith picture Martin Smith · Sep 16, 2011

The answer will of course be "it depends" but based on testing this end...

Assuming

  1. 1 million products
  2. product has a clustered index on product_id
  3. Most (if not all) products have corresponding information in the product_code table
  4. Ideal indexes present on product_code for both queries.

The PIVOT version ideally needs an index product_code(product_id, type) INCLUDE (code) whereas the JOIN version ideally needs an index product_code(type,product_id) INCLUDE (code)

If these are in place giving the plans below

Plans

then the JOIN version is more efficient.

In the case that type 1 and type 2 are the only types in the table then the PIVOT version slightly has the edge in terms of number of reads as it doesn't have to seek into product_code twice but that is more than outweighed by the additional overhead of the stream aggregate operator

PIVOT

Table 'product_code'. Scan count 1, logical reads 10467
Table 'product'. Scan count 1, logical reads 4750
   CPU time = 3297 ms,  elapsed time = 3260 ms.

JOIN

Table 'product_code'. Scan count 2, logical reads 10471
Table 'product'. Scan count 1, logical reads 4750
   CPU time = 1906 ms,  elapsed time = 1866 ms.

If there are additional type records other than 1 and 2 the JOIN version will increase its advantage as it just does merge joins on the relevant sections of the type,product_id index whereas the PIVOT plan uses product_id, type and so would have to scan over the additional type rows that are intermingled with the 1 and 2 rows.