Select running total until specific SUM is reached

Snowy picture Snowy · Jan 24, 2013 · Viewed 9.6k times · Source

I am trying to select the first n rowid values from the following table variable that will get me as close to a sum(itemcount) of 200,000 without crossing that threshhold. If I was looking at this manually, I would just take the top 3 rows. I do not want to use a cursor unless there is no pure-set-based way.

What is a good set-based way to get all of the rowid values "sum while/until" I get to a running total of 200,000?

I looked at "running totals" at http://www.1keydata.com/sql/sql-running-totals.html but that did not seem like it would work out because the real table has around 500k rows.

Here is what I have tried so far:

declare  @agestuff table ( rowid int primary key , itemcount int , itemage datetime )
insert into @agestuff values ( 1 , 175000 , '2013-01-24 17:21:40' )
insert into @agestuff values ( 2 , 300    , '2013-01-24 17:22:11' )
insert into @agestuff values ( 3 , 10000 , '2013-01-24 17:22:11' )
insert into @agestuff values ( 4 , 19000 , '2013-01-24 17:22:19' )
insert into @agestuff values ( 5 , 16000 , '2013-01-24 17:22:22' )
insert into @agestuff values ( 6 , 400   , '2013-01-24 17:23:06' )
insert into @agestuff values ( 7 , 25000 , '2013-01-24 17:23:06' )

select sum(itemcount) from @agestuff  -- 245700 which is too many

select sum(itemcount) from @agestuff  
  where rowid in (1,2,3) -- 185300 which gets me as close as possible

Using SQL Server 2008. I'll switch to 2012 if necessary.

Answer

Aaron Bertrand picture Aaron Bertrand · Jan 24, 2013

Windowing Functions - SQL Server 2012 only

DECLARE @point INT = 200000;

;WITH x(rowid, ic, r, s) AS
(
  SELECT
    rowid, itemcount, ROW_NUMBER() OVER (ORDER BY itemage, rowid),
    SUM(itemcount) OVER (ORDER BY [itemage], rowid RANGE UNBOUNDED PRECEDING)
  FROM @agestuff
)
SELECT x.rowid, x.ic, x.s
FROM x WHERE x.s <= @point
ORDER BY x.rowid; 

Results:

rowid  ic      sum   
-----  ------  ------
1      175000  175000
2      300     175300
3      10000   185300

SQL fiddle demo

If you can't use SQL Server 2012 for some reason, then on SQL Server 2008 you can use a couple of alternatives:


Quirky Update

Note that this behavior is not documented, nor is it guaranteed to calculate your running totals in the correct order. So please use at your own risk.

DECLARE @st TABLE
(
    rowid INT PRIMARY KEY,
    itemcount INT,
    s INT
);
 
DECLARE @RunningTotal INT = 0;
 
INSERT @st(rowid, itemcount, s)
  SELECT rowid, itemcount, 0
    FROM @agestuff
    ORDER BY rowid;
 
UPDATE @st
  SET @RunningTotal = s = @RunningTotal + itemcount
  FROM @st;
 
SELECT rowid, itemcount, s
  FROM @st
  WHERE s < @point
  ORDER BY rowid;

Cursor

DECLARE @st TABLE
(
  rowid INT PRIMARY KEY, itemcount INT, s INT
);
 
DECLARE
  @rowid INT, @itemcount INT, @RunningTotal INT = 0;
 
DECLARE c CURSOR LOCAL FAST_FORWARD
  FOR SELECT rowid, itemcount
    FROM @agestuff ORDER BY rowid;
 
OPEN c;
 
FETCH c INTO @rowid, @itemcount;
 
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @RunningTotal = @RunningTotal + @itemcount;

    IF @RunningTotal > @point
      BREAK;
 
    INSERT @st(rowid, itemcount, s)
      SELECT @rowid, @itemcount, @RunningTotal;
 
    FETCH c INTO @rowid, @itemcount;
END
 
CLOSE c;
DEALLOCATE c;
 
SELECT rowid, itemcount, s
  FROM @st
  ORDER BY rowid;

I chose only two alternatives because others are even less desirable (mostly from a performance perspective). You can see them in the following blog post, with some background on how they perform and more information about potential gotchas. Don't paint yourself into a corner because you're stuck on the idea that cursors are bad - sometimes, like in this case, they can be the most efficient supported and reliable option:

http://www.sqlperformance.com/2012/07/t-sql-queries/running-totals