How to remove duplicated records/observations WITHOUT sorting in SAS?

Question 1

How to remove duplicated records/observations WITHOUT sorting in SAS?

sorting sas duplicates

mj023119 · Apr 18, 2011 · Viewed 23.4k times · Source

Answer

You could use a hash object to keep track of which key values have already been seen as you pass through the data set, and output an observation only when its key has not been encountered before. The first observation for each key is kept, and the output stays in the order in which the observations appear in the input data set.

Here is an example using the input data set "sashelp.cars", removing duplicates by Make. The original data is in alphabetical order by Make, so you can see that the output data set "nodupes" maintains that same order.

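data nodupes (drop=rc);
  length Make $13;  /* key variable, same length as in sashelp.cars */

  /* hash object holding every Make value seen so far */
  declare hash found_keys();
    found_keys.definekey('Make');
    found_keys.definedone();

  /* read the whole input within a single execution of the DATA step */
  do while (not done);
    set sashelp.cars end=done;
    rc=found_keys.check();    /* rc=0 means the key is already in the hash */
    if rc^=0 then do;
      rc=found_keys.add();    /* first occurrence: remember the key */
      output;                 /* and keep this observation */
    end;
  end;
  stop;
run;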

proc print data=nodupes;run;
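
The same idea extends to several key variables, which is closer to the PROC SORT call in the question. Below is a minimal sketch, assuming the input data set is called YOURDATA and duplicates are defined by var1 through var5. Both names are placeholders taken from the question, not a real data set, so adjust them to your own data.

```sas
data yourdata_nodupe;
  /* create the hash object once, on the first iteration of the DATA step */
  if _n_ = 1 then do;
    declare hash seen();
    seen.definekey('var1', 'var2', 'var3', 'var4', 'var5');
    seen.definedone();
  end;

  set YOURDATA;   /* placeholder input data set from the question */

  /* check() returns 0 when this key combination is already stored */
  if seen.check() ne 0 then do;
    rc = seen.add();   /* first occurrence: remember the key combination */
    output;            /* observations are written in input order */
  end;
  drop rc;
run;
```

This keeps the first observation for each combination of the key variables. The hash object lives in memory, so the set of distinct key combinations has to fit in the memory available to the session.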

Question 2

I wonder if there is a way to remove duplicate records WITHOUT sorting. Sometimes I want to keep the original order and just remove the duplicated records.

Is it possible?

BTW, below are the methods I know for removing duplicate records, both of which end up sorting the data.

1.

proc sql;
   create table yourdata_nodupe as
   select distinct *
   from abc;
quit;

2.

proc sort data=YOURDATA nodupkey;    
    by var1 var2 var3 var4 var5;    
run;
