Select 1000 distinct names from 100 million records via standard sql

whytheq picture whytheq · Mar 29, 2013 · Viewed 9.7k times · Source

I have a table tb_FirstName with one field FirstName. The table has 100 million non null records with lots of repetitions e.g. John occurs 2 million times. The distinct count of FirstName is over 2 million.

How do I select 1000 distinct names as quickly as possible using standard sql?

I'm currently using the following but this is

  • tSQL
  • Maybe not as efficient as it could be.

    SELECT x.FirstName
    FROM (
        SELECT  FirstName,
                rnk = RANK() OVER (ORDER BY Firstname)
        FROM    WHData.dbo.tb_DimUserAccount A
        GROUP BY FirstName
        ) x
    WHERE rnk <=1000
    

Answer

sgeddes picture sgeddes · Mar 29, 2013

Seems like you could use TOP 1000 with DISTINCT:

SELECT DISINCT TOP 1000 FirstName
FROM WHData.dbo.tb_DimUserAccount
ORDER BY FirstName

Condensed SQL Fiddle Demo