I'm writing a process that archives rows from a SQL Server table based on a datetime column. I want to move all the rows with a date before X, but the problem is that there are millions of rows for each date, so doing a BEGIN TRANSACTION...INSERT...DELETE...COMMIT for each date takes too long, and locks up the database for other users.
Is there a way that I can do it in smaller chunks? Maybe using ROWCOUNT or something like that?
I'd originally considered something like this:
SET ROWCOUNT 1000
DECLARE @RowsLeft DATETIME
DECLARE @ArchiveDate DATETIME
SET @RowsLeft = (SELECT TOP 1 dtcol FROM Events WHERE dtcol <= @ArchiveDate)
WHILE @RowsLeft IS NOT NULL
BEGIN
    INSERT INTO EventsBackups
    SELECT TOP 1000 * FROM Events WHERE dtcol <= @ArchiveDate
    DELETE Events WHERE dtcol <= @ArchiveDate
    SET @RowsLeft = (SELECT TOP 1 dtcol FROM Events WHERE dtcol <= @ArchiveDate)
END
But then I realized that, with no ORDER BY on either statement, I can't guarantee that the 1000 rows I'm deleting are the same 1000 rows I just backed up. Or can I...?
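One way to make the match explicit is to drive both statements off a captured set of keys, so the INSERT and DELETE touch exactly the same rows. This is a sketch, not from the original post, and it assumes Events has an integer primary key (EventID is a hypothetical column name here):

```sql
-- Sketch: key-driven chunking. Assumes a hypothetical INT primary key, EventID.
DECLARE @ArchiveDate DATETIME
SET @ArchiveDate = '2009-01-01'  -- example cutoff

CREATE TABLE #Chunk (EventID INT PRIMARY KEY)

WHILE 1 = 1
BEGIN
    -- 1. Pick the next chunk of keys.
    INSERT INTO #Chunk (EventID)
    SELECT TOP 1000 EventID FROM Events WHERE dtcol <= @ArchiveDate

    IF @@ROWCOUNT = 0 BREAK

    -- 2. Archive and delete exactly those rows, as one short transaction.
    BEGIN TRANSACTION
        INSERT INTO EventsBackups
        SELECT e.* FROM Events e JOIN #Chunk c ON e.EventID = c.EventID

        DELETE e FROM Events e JOIN #Chunk c ON e.EventID = c.EventID
    COMMIT TRANSACTION

    TRUNCATE TABLE #Chunk
END

DROP TABLE #Chunk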
UPDATE: Another option I'd considered was adding a step:
Does anybody have an idea for how the expense of this series might compare to some of the other options discussed below?
DETAIL: I'm using SQL Server 2005, since somebody asked.
Just INSERT the result of the DELETE:
WHILE 1 = 1
BEGIN
    WITH EventsTop1000 AS (
        SELECT TOP 1000 *
        FROM Events
        WHERE <your condition of choice>
    )
    DELETE EventsTop1000
        OUTPUT DELETED.*
        INTO EventsBackups;

    IF @@ROWCOUNT = 0
        BREAK;
END
This is atomic and consistent: each DELETE ... OUTPUT is a single statement, so the rows written to EventsBackups are by definition exactly the rows removed from Events, and each 1000-row chunk commits on its own, keeping lock durations short.
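Applied to the question's schema, the pattern might look like the sketch below. The cutoff value is an example, and the WAITFOR is an optional addition (not part of the original answer) that gives other sessions a window to acquire locks between chunks:

```sql
DECLARE @ArchiveDate DATETIME
SET @ArchiveDate = '2009-01-01'  -- example cutoff

WHILE 1 = 1
BEGIN
    -- Delete one 1000-row chunk and route the deleted rows to the archive
    -- table in the same statement.
    WITH EventsTop1000 AS (
        SELECT TOP 1000 *
        FROM Events
        WHERE dtcol <= @ArchiveDate
    )
    DELETE EventsTop1000
        OUTPUT DELETED.*
        INTO EventsBackups;

    IF @@ROWCOUNT = 0
        BREAK;

    -- Optional: brief pause so other sessions aren't starved between chunks.
    WAITFOR DELAY '00:00:01';
END
```

Note that OUTPUT ... INTO requires the target table's columns to line up with the deleted rows, and the target can't have enabled triggers or be part of a foreign-key relationship on either side.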