SAS variable concatenation through data step

Yoh picture Yoh · Apr 29, 2011 · Viewed 19.7k times · Source

I am looking for a way to create a string variable containing certain values of the dataset while going through the data step.

Example data set work.test:

AddToStringYN    Value
     Y           One
     Y           Two
     N           Three
     Y           Four
     N           Five

So in the end, the variable would look like: OneTwoFour (or even better FourTwoOne). This looks so simple, but I can't seem to find a way to do it. I also tried to work with macro variables like this:

%let stringvar=;
Data _null_;
  set work.test;
  if AddToStringYN = "Y" then do;
    call symput('stringvar',"&stringvar" || strip(value));
  end;
Run;

But this gives:

GLOBAL STRINGVAR Four

So I only get the last value. I get that this must be because of some misunderstanding of mine about this macro facility, but I don't understand why there is only the last value in the variable. I thought it was only the last time the symput was called that it was actually executed or something, but then when I adjust the code to:

%let stringvar=;
Data _null_;
  set work.test;
  if AddToStringYN = "Y" then do;
    call symput('stringvar'||strip(value),"&stringvar" || strip(value));
  end;
Run;

Then I do get them all:

GLOBAL STRINGVARONE  One
GLOBAL STRINGVARTWO  Two
GLOBAL STRINGVARFOUR  Four

So my last guess is that going through the data step, the 'call symput...' line is actually added to the macro processor where the "&stringvar" is already replaced and only after the final statement are they all executed.
Is this a good assumption or is there another explanation? And back to the original question: is there an easy way to achieve this (having the desired variable)?

Answer

sasfrog picture sasfrog · Apr 29, 2011

The following is my answer to your identical question on RunSubmit.com. I think you and @Fabio may be over-engineering the solution, it doesn't need any iterating data step code at all...

First, the easy way to do what you're trying to do is like this:

proc sql;
  select Value into :StringVar separated by ''
    from work.test
    where AddToStringYN='Y'
    ;
quit;

Here, you can take advantage of the SQL interface with SAS/MACRO, using the select into syntax. You could even add an order by clause to get a particular order you're looking for.

Second, since you've happened upon something about the way SAS macro works and you're keen to understand it: in your first example, the first thing the compiler does before executing your code is to resolve the value of &stringvar, which at that point is empty. So after compilation, with this token replaced, your code looks like this to SAS...

%let stringvar=;
Data _null_;
  set work.test;
  if AddToStringYN = "Y" then do;
    call symput('stringvar',"" || strip(value));
  end;
Run;

...then SAS goes ahead and runs that code (which happens to be valid code, but is concatenating an empty string to the start of something). And because of the way the data step works, each iteration of the data step is in fact replacing the value of StringVar, which is why at the end of the data step, it's left with the last value that was read in.