Getting unique values in Excel by using formulas only

Patrick Honorez picture Patrick Honorez · Sep 16, 2009 · Viewed 407.6k times · Source

Do you know a way in Excel to "calculate" by formula a list of unique values ?

E.g: a vertical range contains values "red", "blue", "red", "green", "blue", "black"
and I want to have as result "red, "blue", "green", "black" + eventually 2 other blank cells.

I already found a way to get a calculated sorted list using SMALL or LARGE combined with INDEX, but I'd like to have this calculated sort as well, WITHOUT USING VBA.

Answer

Drew Sherman picture Drew Sherman · Sep 17, 2009

Ok, I have two ideas for you. Hopefully one of them will get you where you need to go. Note that the first one ignores the request to do this as a formula since that solution is not pretty. I figured I make sure the easy way really wouldn't work for you ;^).

Use the Advanced Filter command

  1. Select the list (or put your selection anywhere inside the list and click ok if the dialog comes up complaining that Excel does not know if your list contains headers or not)
  2. Choose Data/Advanced Filter
  3. Choose either "Filter the list, in-place" or "Copy to another location"
  4. Click "Unique records only"
  5. Click ok
  6. You are done. A unique list is created either in place or at a new location. Note that you can record this action to create a one line VBA script to do this which could then possible be generalized to work in other situations for you (e.g. without the manual steps listed above).

Using Formulas (note that I'm building on Locksfree solution to end up with a list with no holes)

This solution will work with the following caveats:

  • The list must be sorted (ascending or descending does not matter). Actually that's quite accurate as the requirement is really that all like items must be contiguous but sorting is the easiest way to reach that state.
  • Three new columns are required (two new columns for calculations and one new column for the new list). The second and third columns could be combined but I'll leave that as an exercise to the reader.
  • Here is the summary of the solution:

    1. For each item in the list, calculate the number of duplicates above it.
    2. For each place in the unique list, calculate the index of the next unique item.
    3. Finally, use the indexes to create a new list with only unique items.

    And here is a step by step example:

    1. Open a new spreadsheet
    2. In a1:a6 enter the example given in the original question ("red", "blue", "red", "green", "blue", "black")
    3. Sort the list: put the selection in the list and choose the sort command.
    4. In column B, calculate the duplicates:
      1. In B1, enter "=IF(COUNTIF($A$1:A1,A1) = 1,0,COUNTIF(A1:$A$6,A1))". Note that the "$" in the cell references are very important as it will make the next step (populating the rest of the column) much easier. The "$" indicates an absolute reference so that when the cell content is copy/pasted the reference will not update (as opposed to a relative reference which will update).
      2. Use smart copy to populate the rest of column B: Select B1. Move your mouse over the black square in the lower right hand corner of the selection. Click and drag down to the bottom of the list (B6). When you release, the formula will be copied into B2:B6 with the relative references updated.
      3. The value of B1:B6 should now be "0,0,1,0,0,1". Notice that the "1" entries indicate duplicates.
    5. In Column C, create an index of unique items:
      1. In C1, enter "=Row()". You really just want C1 = 1 but using Row() means this solution will work even if the list does not start in row 1.
      2. In C2, enter "=IF(C1+1<=ROW($B$6), C1+1+INDEX($B$1:$B$6,C1+1),C1+1)". The "if" is being used to stop a #REF from being produced when the index reaches the end of the list.
      3. Use smart copy to populate C3:C6.
      4. The value of C1:C6 should be "1,2,4,5,7,8"
    6. In column D, create the new unique list:
      1. In D1, enter "=IF(C1<=ROW($A$6), INDEX($A$1:$A$6,C1), "")". And, the "if" is being used to stop the #REF case when the index goes beyond the end of the list.
      2. Use smart copy to populate D2:D6.
      3. The values of D1:D6 should now be "black","blue","green","red","","".

    Hope this helps....