Reputation: 33
I have a particular problem. I have exported a csv file where I on some columns needed to put the data in quoation-marks because of leading zeros, and sometimes a long datanumber includes "E" in them on the export. Now I am trying to import the same file into SAS to see if my proc import-routine works.
When I import the file all of the data comes through, but are compressed into two columns(hence wrong with my delimiter?) when I actually exported 20 columns.
Not all columns are enclosed in quotation-marks, just a couple of them. An example of the data:
CustomerID CustomerName Product Price BillingNR
"01234" Customer 1 Product1 Price1 "03541"
"52465" Customer 2 Product2 Price2 ""
"23454" Customer 3 Product3 Price3 "035411236952154589632154"
CustomerID and BillingNR are then enclosed in quotation marks.
How can I import this dataset when only some of the columns are enclosed in quotation marks while others arent? Or simply remove all double quotes from the when importing? Heres my code:
%macro import;
%if &exist= "Yes" %then %do;
proc import
datafile= "\\mypath\data.csv"
DBMS=CSV
out=Sales
replace;
getnames=YES;
run;
%end;
%else %do;
%put Nothing happens;
%end;
%mend;
%lesInn;
The IF/ELSE-test is just another macro where i test if the file specified exists. I have tried to research different methods, and am still looking for similar problems, but nothing have seemed to work.
All answers much appreciated.
Toor
Upvotes: 0
Views: 3691
Reputation: 51566
If you read the file using the DSD option then SAS will automatically remove the quotes from around the values. Even quotes that are around values that do not need to be quoted, like most of your example data.
data want ;
infile cards dsd truncover firstobs=2;
length CustomerID $5 CustomerName $20 Product $20 Price $8 BillingNR $30 ;
input CustomerID -- BillingNR ;
cards;
CustomerID,CustomerName,Product,Price,BillingNR
"01234",Customer 1,Product1,Price1,"03541"
"52465",Customer 2,Product2,Price2,""
"23454",Customer 3,Product3,Price3,"035411236952154589632154"
;
Upvotes: 0
Reputation: 21264
CSV -> Comma Separated Values I don't see commas being used as your delimiters, but pipes.
Specify that your delimiter is a pipe, and increase the GUESSINGROWS option to a large number so it assigns the correct length and type.
Proc import ... DBMS = DLM Replace;
Delimiter='|';
GuessingRows=10000;
....remaining options;
Run;
I'm still not sure Proc Import will work. If it doesn't you'll need to write the data step code and make sure to specify the DSD option which will deal with the quotes.
Edit: Based on question edit, most accurate method is to read via a data step. As mentioned the DSD option will handle the quotes.
Upvotes: 0