I`ve got (and will receive in the future) many CSV files that use the semicolon as delimiter and the comma as decimal separator. So far I could not find out how to import these files into SAS using proc import -- or in any other automated fashion without the need for messing around with the variable names manually.
Create some sample data:
%let filename = %sysfunc(pathname(work))\sap.csv;
data _null_;
file "&filename";
put 'a;b';
put '12345,11;67890,66';
run;
The import code:
proc import out = sap01
datafile= "&filename"
dbms = dlm;
delimiter = ";";
GETNAMES = YES;
run;
After the import a value for the variable "AMOUNT" such as 350,58 (which corresponds to 350.58 in the US format) would look like 35,058 (meaning thirtyfivethousand...) in SAS (and after re-export to the German EXCEL it would look like 35.058,00). A simple but dirty workaround would be the following:
data sap02; set sap01;
AMOUNT = AMOUNT/100;
format AMOUNT best15.2;
run;
I wonder if there is a simple way to define the decimal separator for the CVS-import (similar to the specification of the delimiter). ..or any other "cleaner" solution compared to my workaround. Many thanks in advance!
You technically should use dbms=dlm
not dbms=csv
, though it does figure things out. CSV means "Comma separated values", while DLM means "delimited", which is correct here.
I don't think there's a direct way to make SAS read in with the comma via PROC IMPORT
. You need to tell SAS to use the NUMXw.d informat when reading in the data, and I don't see a way to force that setting in SAS. (There's an option for output with a comma, NLDECSEPARATOR
, but I don't think that works here.)
Your best bet is either to write data step code yourself, or to run the PROC IMPORT
, go to the log, and copy/paste the read in code into your program; then for each of the read-in records add :NUMX10.
or whatever the appropriate maximum width of the field is. It will end up looking something like this:
data want;
infile "whatever.txt" dlm=';' lrecl=32767 missover;
input
firstnumvar :NUMX10.
secondnumvar :NUMX10.
thirdnumvar :NUMX10.
fourthnumvar :NUMX10.
charvar :$15.
charvar2 :$15.
;
run;
It will also generate lots of informat and format code; you can alternately convert the informats to NUMX10.
instead of BEST.
instead of adding the informat to the read-in. You can also just remove the informats, unless you have date fields.
data want;
infile "whatever.txt" dlm=';' lrecl=32767 missover;
informat firstnumvar secondnumvar thirdnumvar fourthnumvar NUMX10.;
informat charvar $15.;
format firstnumvar secondnumvar thirdnumvar fourthnumvar BEST12.;
format charvar $15.;
input
firstnumvar
secondnumvar
thirdnumvar
fourthnumvar
charvar $
;
run;