Concatenate one field after GROUP BY

Moslem Ben Dhaou picture Moslem Ben Dhaou · Nov 30, 2012 · Viewed 9.3k times · Source

This question have been asked many times in SO but none of the answers is satisfying to my situation.

  1. Question 1
  2. Question 2
  3. Question 3
  4. Question 4

I am dealing with a DataObjectVersions table that contains multiple versions for around 1.2 million unique objects (and increasing). I need to concatenate changes from a specific field for each unique object.

Right now I am using the solution with the XML Path presented in Q3 but running such a query on this table is a total performance disaster. SQL Server started to retun Data after 19mn. Knowing that this data will be than joined twice, you can imagine the impact.

I am looking for the most efficient scalability-aware way to concatenate the values of the same fields of different rows grouped by an other field (which is not of course a key). To be more precise, this is used within a view in a Datawarehouse.

EDIT:

I tried to simplify the description but here is a complete overview I have multiple tables with the following columns

   [ID]
   [CreatedTime]
   [CreatedBy]
   [DeletedTime]
   [DeletedBy]
   [ResourceId]
   [AccountId]
   [Type]

A view is used to return the union of all records from all tables, which will still return the same columns (described in my questions by the versions table). [ResourceId] and [AccountId] are a unique composite identifier of an object (Group membership, System account, etc.. a resource assignment specifically). The [Type] is used to identify different levels (like Read/Write/Execute in the case of a file assignment)

All other fields contain the same values (in different tables) for different unique objects. I need to get the objects and concatenate the values of the [Type] column. All the row are processed afterward and the ([ResourceId],[AccountId]) combination must be unique (not the case when different types exists).

EDIT 2:

I am using this function:

CREATE FUNCTION [dbo].[GetUniqueType]
(
    @ResourceId as uniqueidentifier,
    @Account as uniqueidentifier
)
RETURNS nvarchar(100)
AS
BEGIN   
    return STUFF((select ',' + raType.Type from vwAllAssignments raType where raType.AccountId = @Account and raType.ResourceId = @ResourceId and raType.DeletedBy is null for xml path('')), 1,1,'')
END

GO

vwAllAssignments is the view returning the union of all tables rows.

Finally I am selecting

SELECT [CreatedTime]
      ,[DeletedTime]
      ,[DeletedBy]
      ,[ResourceId]
      ,[AccountId]
      ,dbo.GetUniqueType([ResourceId],[AccountId]) AS [Type]
FROM vwAllAssignments
GROUP BY [ResourceId], [AccountId], [CreatedTime], [DeletedTime], [DeletedBy]

Answer

Mikael Eriksson picture Mikael Eriksson · Nov 30, 2012

Try this:

SELECT [CreatedTime]
      ,[DeletedTime]
      ,[DeletedBy]
      ,[ResourceId]
      ,[AccountId]
      ,STUFF((select ',' + raType.Type 
              from vwAllAssignments raType 
              where raType.AccountId = vwAllAssignments.AccountId and 
                    raType.ResourceId = vwAllAssignments.ResourceId and 
                    raType.DeletedBy is null 
              for xml path('')), 1,1,'') AS [Type]
FROM vwAllAssignments
GROUP BY [ResourceId], [AccountId], [CreatedTime], [DeletedTime], [DeletedBy]

And an index like this should be helpful.

create index IX_vwAllAssignments on vwAllAssignments(AccountId, ResourceId, DeletedBy) include(Type)