Most efficient method to detect column change in MS SQL Server

mwigdahl picture mwigdahl · Mar 16, 2009 · Viewed 79.1k times · Source

Our system runs on SQL Server 2000, and we are in the process of preparing for an upgrade to SQL Server 2008. We have a lot of trigger code where we need to detect a change in a given column and then operate on that column if it has changed.

Obviously SQL Server provides the UPDATE() and COLUMNS_UPDATED() functions, but these functions only tell you which columns have been implicated in the SQL statement, not which columns have actually changed.

To determine which columns have changed, you need code similar to the following (for a column that supports NULLs):

IF UPDATE(Col1)
    SELECT @col1_changed = COUNT(*) 
    FROM Inserted i
        INNER JOIN Deleted d ON i.Table_ID = d.Table_ID
    WHERE ISNULL(i.Col1, '<unique null value>') 
            != ISNULL(i.Col1, '<unique null value>')

This code needs to be repeated for every column you are interested in testing. You can then check the 'changed' value to determine whether or not to perform expensive operations. Of course, this code is itself problematic, as it only tells you that at least one value in the column has changed over all the rows that were modified.

You can test individual UPDATE statements with something like this:

UPDATE Table SET Col1 = CASE WHEN i.Col1 = d.Col1 
          THEN Col1 
          ELSE dbo.fnTransform(Col1) END
FROM Inserted i
    INNER JOIN Deleted d ON i.Table_ID = d.Table_ID

... but this doesn't work well when you are needing to invoke a stored procedure. In those cases you have to fall back on other approaches as far as I can tell.

My question is whether anyone has insight (or, better yet, hard data) as to what the best/cheapest approach is to the problem of predicating a database operation in a trigger on whether a particular column value in a modified row has actually changed or not. Neither of the methods above seem ideal, and I was wondering if a better method exists.

Answer

HLGEM picture HLGEM · Mar 16, 2009

Let's start with I would never and I mean never invoke a stored proc in a trigger. To account for a multi row insert you would have to cursor through the proc. This means the 200,000 rows you just loaded though a set-based query (say upddating all prices by 10%) might well lock the table for hours as the trigger tries valiantly to handle the load. Plus if something changes in the proc, you could break any inserts to the table at all or even completely hang up the table. I'm a firm beliver that trigger code should call nothing else outside the trigger.

Personally I prefer to simply do my task. If I have written the actions I want to do properly in the trigger it will only update, delete or insert where columns have changed.

Example: suppose you want to update the last_name field that you are storing in two places due to a denormalization placed there for performance reasons.

update t
set lname = i.lname
from table2 t 
join inserted i on t.fkfield = i.pkfield
where t.lname <>i.lname

As you can see it would only update the lnames that are different than what is currently in the table I am updating.

If you want to do auditing and record only those rows which changed then do the comparison using all fields something like where i.field1 <> d.field1 or i.field2 <> d.field3 (etc through all the fields)