I have a case where I need to translate (lookup) several values from the same table. The first way I wrote it, was using subqueries:
SELECT
(SELECT id FROM user WHERE user_pk = created_by) AS creator,
(SELECT id FROM user WHERE user_pk = updated_by) AS updater,
(SELECT id FROM user WHERE user_pk = owned_by) AS owner,
[name]
FROM asset
As I'm using this subquery a lot (that is, I have about 50 tables with these fields), and I might need to add some more code to the subquery (for example, "AND active = 1" ) I thought I'd put these into a user-defined function UDF and use that. But the performance using that UDF was abysmal.
CREATE FUNCTION dbo.get_user ( @user_pk INT )
RETURNS INT
AS BEGIN
RETURN ( SELECT id
FROM ice.dbo.[user]
WHERE user_pk = @user_pk )
END
SELECT dbo.get_user(created_by) as creator, [name]
FROM asset
The performance of #1 is less than 1 second. Performance of #2 is about 30 seconds...
Why, or more importantly, is there any way I can code in SQL server 2008, so that I don't have to use so many subqueries?
Just a litte more explanation of when this is useful. This simple query (that is, get userid) gets a lot more complex when I want to have a text for a user, since I have to join with profile to get the language, with a company to see if the language should be fetch'ed from there instead, and with the translation table to get the translated text. And for most of these queries, performance is a secondary issue to readability and maintainability.
The UDF is a black box to the query optimiser so it's executed for every row. You are doing a row-by-row cursor. For each row in an asset, look up an id three times in another table. This happens when you use scalar or multi-statement UDFs (In-line UDFs are simply macros that expand into the outer query)
One of many articles on the problem is "Scalar functions, inlining, and performance: An entertaining title for a boring post".
The sub-queries can be optimised to correlate and avoid the row-by-row operations.
What you really want is this:
SELECT
uc.id AS creator,
uu.id AS updater,
uo.id AS owner,
a.[name]
FROM
asset a
JOIN
user uc ON uc.user_pk = a.created_by
JOIN
user uu ON uu.user_pk = a.updated_by
JOIN
user uo ON uo.user_pk = a.owned_by
Update Feb 2019
SQL Server 2019 starts to fix this problem.