Get unique values using STRING_AGG in SQL Server

gkoul picture gkoul · May 29, 2018 · Viewed 33.7k times · Source

The following query returns the results shown below:

SELECT 
    ProjectID, newID.value
FROM 
    [dbo].[Data] WITH(NOLOCK)  
CROSS APPLY 
    STRING_SPLIT([bID],';') AS newID  
WHERE 
    newID.value IN ('O95833', 'Q96NY7-2') 

Results:

ProjectID   value
---------------------
2           Q96NY7-2
2           O95833
2           O95833
2           Q96NY7-2
2           O95833
2           Q96NY7-2
4           Q96NY7-2
4           Q96NY7-2

Using the newly added STRING_AGG function (in SQL Server 2017) as it is shown in the following query I am able to get the result-set below.

SELECT 
    ProjectID,
    STRING_AGG( newID.value, ',') WITHIN GROUP (ORDER BY newID.value) AS 
NewField
FROM
    [dbo].[Data] WITH(NOLOCK)  
CROSS APPLY 
    STRING_SPLIT([bID],';') AS newID  
WHERE 
    newID.value IN ('O95833', 'Q96NY7-2')  
GROUP BY 
    ProjectID
ORDER BY 
    ProjectID

Results:

ProjectID   NewField
-------------------------------------------------------------
2           O95833,O95833,O95833,Q96NY7-2,Q96NY7-2,Q96NY7-2
4           Q96NY7-2,Q96NY7-2

I would like my final output to have only unique elements as below:

ProjectID   NewField
-------------------------------
2           O95833, Q96NY7-2
4           Q96NY7-2

Any suggestions about how to get this result? Please feel free to refine/redesign from scratch my query if needed.

Answer

JohnLBevan picture JohnLBevan · May 29, 2018

Use the DISTINCT keyword in a subquery to remove duplicates before combining the results: SQL Fiddle

SELECT 
ProjectID
,STRING_AGG(value, ',') WITHIN GROUP (ORDER BY value) AS 
NewField
from (
    select distinct ProjectId, newId.value 
    FROM [dbo].[Data] WITH(NOLOCK)  
    CROSS APPLY STRING_SPLIT([bID],';') AS newID  
    WHERE newID.value IN (   'O95833' , 'Q96NY7-2'  )  
) x
GROUP BY ProjectID
ORDER BY ProjectID