|
|
Free Software, Source Code, and Best-Practice Documentation Index from your
Cleveland Software, Database, Web Design, Consulting, and SQL / Delphi Experts
|
SQL Server - Other: Set-Based Running Subtotals (i.e., Row-Level Accumulators) for Delimited-list Accumulator Functionality
|
|
Have you ever wanted to perform a set-based running-subtotal operation in SQL-Server (i.e., have row-level accumulators store a running total or running subtotal or aggregation of values within a "break level"?) Want the solution to not use cursors?
Well, I'm going to present one cursorless method of implementing such functionality in a SQL-Server Query, Stored-Procedure, or Function.
WHY DO THIS?
I have run into a few situations where this has been extremely useful. A typical situation involves reporting functionality, where a report is supposed to show a list of values that exist for all items in a group, but only report this information in summary at a group level. Like, e.g., you have a part number with (potentially) a lot of sub-parts / components that make up the "parent" product, and you want to generate a report showing (at the product level) information like price, cost, build-time, and a list of sub-components (in summary - like just their part numbers). Well, this is how that can be accomuplished. Also, I have used this technique to "bind" two reporting procedures together where a master-detail report links the details as a comma-delimited list of primary-key values (and then, in the detail report, I use the power of a user-defined function to transform that delimited list of key-values into a table for joining - using this user-defined delimited-list parser SQL function).
I have seen running totals done before, with varied methods, but what I have not seen much about elsewhere is how to do running subtotals. The code I wrote that appears below can easily do both, and does. It is easily adapted to do multi-column break-value running subtotals, as noted in the code comments.
Note: for this example, I used the SQL Server 2005 Adventureworks sample database, which provided me with a test case of approximately 114,000 rows against which I perform the running-subtotal (by product level) logic and do my row-level accumulator manipulations (using SET logic). This query is relatively efficient and took only 15 seconds to execute on my dual Xeon server with RAID array, for the entire operation (before I limited output with TOP() function.
|
|
|
DBCC DROPCLEANBUFFERS -- Clears the data cache
DBCC FREEPROCCACHE -- Clears the procedure cache
GO
--I do the above two steps to make sure run-times aren't showing faster than reality (from prev cached data, etc)
declare @start datetime
set @start = getdate()
DECLARE @RunningSubtotalDelimString VARCHAR(MAX)
SET @RunningSubtotalDelimString = ''
DECLARE @SubTotalBreakValue1 INT
SET @SubTotalBreakValue1 = -999 --Some value that will NOT exist in the real data for column in question
DECLARE @Results TABLE
(
UniqueID INT IDENTITY NOT NULL PRIMARY KEY,
SubtotalBreakColumn1 INT,
ReferenceOrderID INT,
OrderLineIDsForProduct VARCHAR(MAX)
)
INSERT INTO @Results(
SubtotalBreakColumn1,
ReferenceOrderID)
SELECT
ProductID,
ReferenceOrderID
FROM
Production.TransactionHistory
ORDER BY
ProductID, --Insert into table in our subtotal-breaking order (IMPORTANT!)
ReferenceOrderID --and, if we care, sort the OrderIDs too
--Now, let's set those accumulators!
--The subtotal logic relies on the break-level-variable(s) having their values updated AFTER the subtotal is calc'd, since
--they are used for comparisons to "prior row" values essentially.
--This logic can easily be extended to support subtotal-breaks that rely on multiple-column values simply by altering the
--CASE statement to compare more than one column prior to resetting the Subtotal accumulator.
-- i.e., CASE WHEN (@SubTotalBreakValue1 = SubtotalBreakColumn1 AND @SubTotalBreakValue2 = SubtotalBreakColumn2 AND...)
UPDATE
@Results
SET
@RunningSubtotalDelimString =
OrderLineIDsForProduct =
CASE
WHEN @SubTotalBreakValue1 = SubtotalBreakColumn1
THEN @RunningSubtotalDelimString +
CASE WHEN @RunningSubtotalDelimString <> '' THEN ',' ELSE '' END +
CONVERT(VARCHAR(10), ReferenceOrderID)
ELSE CONVERT(VARCHAR(10), ReferenceOrderID)
END,
@SubTotalBreakValue1= SubtotalBreakColumn1
--LET'S SEE WHAT THE DATA LOOKS LIKE NOW...
SELECT TOP(200) * FROM @results
ORDER BY UniqueID
--UniqueID SubtotalBreakColumn1 ReferenceOrderID OrderLineIDsForProduct
------------- -------------------- ---------------- ---------------------------------------------------
--1 1 426 426
--2 1 505 426,505
--3 1 588 426,505,588
--4 1 675 426,505,588,675
--5 1 758 426,505,588,675,758
--6 1 841 426,505,588,675,758,841
--...
--...
--45 1 3931 426,505,588,[...etc...],3852,3931
--46 2 425 425
--47 2 504 425,504
--48 2 587 425,504,587
--49 2 674 425,504,587,674
--...
--...
--Now, perhaps we only want one row with the delimited list of unique values for entire break-level
--/ accumulator level / "subtotal level" / aggregation level / running subtotal level.
--NOTE: Limit to a small sample
SELECT TOP(10)
SubtotalBreakColumn1,
MAX(OrderLineIDsForProduct) AS AccumulatedOrderIDs --The "MAX()" row is one with most OrderIDs
FROM @results
GROUP BY SubtotalBreakColumn1
ORDER BY SubtotalBreakColumn1
--SubtotalBreakColumn1 AccumulatedOrderIDs
---------------------- ------------------------------------------------------------------------------------
--1 426,505,588,675,758,841,924,1011,[...etc...],3378,3457,3536,3615,3694,3773,3852,3931
--2 425,504,587,674,757,836,923,1010,[...etc...],3377,3456,3535,3614,3693,3772,3851,3930
|
|