Francisco Ariel

Reputation: 35

Read data with large strings with SAS

I want to read a .csv file that contains long strings with SAS. This is my file, tmp.csv, in comma-separated values format:

1,1005725,[(B42.ND761).B437]1-8-1-1-1-3-3-3-2-2/RT0658,5S3563A/RT0658,,,5S3563A,RT0658
2,09VL101,20347 PL6 O94 E98-1-0/K9616LM,19058/K9616LM,19058,,19058,K9616LM
3,09VL102,20351 PL6-1-0/K9616LM 19060/K9616LM,,19060,,19060,K9616LM
4,09VL103,20347 PL6 O94 E98-2-0/K9962LM,AID19058A/K9962LM,19058,,AID19058A,K9962LM
5,09VL105,,V4649A/F0001LM,,,V4649A,F0001LM

I used this code, but it didn't work:

DATA datos;
INFILE "C:\Users\UserName\Documents\tmp.csv" DLM="," DSD MISSOVER;
INPUT Num Code :$7. Pedigree  : $44. LineCode : $17. FemaleCode $5. MaleCode $ NFemale $9. NMale $7. ;
RUN;

This should be the result

Correct Data

Upvotes: 0

Views: 133

Answers (2)

Tom

Reputation: 51566

I know it seems like you are saving typing by putting the informats in the INPUT statement, but I think it is much easier to define the variables first and then write the INPUT statement, especially when reading from a delimited file. If you define the variables in the same order that you want to read them, you can even just use a variable list in the INPUT statement.

DATA datos;
  INFILE "C:\Users\UserName\Documents\tmp.csv" DSD TRUNCOVER;
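  /* One variable per field, in file order. MaleCode $8 is an assumption: */
  /* it matches the default length implied by "MaleCode $" in the question. */
  LENGTH Num 8 Code $7 Pedigree $44 LineCode $17 FemaleCode $5 MaleCode $8 NFemale $9 NMale $7 ;
  INPUT Num -- NMale ;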
RUN;

Also, it is generally better to use the TRUNCOVER option instead of MISSOVER on the INFILE statement. Most of the time you do not want SAS to set the value to missing when you ask it to read 7 characters and only 3 are available on the line; you would prefer to have SAS use the 3 characters that are available. It won't make a difference for list input from a delimited file, but if you use formatted input without the : modifier, MISSOVER can lose data.
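To make the MISSOVER/TRUNCOVER difference concrete, here is a minimal sketch. The fileref `demo`, the dataset names, and the three sample values are made up for illustration. It writes the lines to a temporary file first, because in-stream DATALINES records are padded with blanks and would hide the effect.

```sas
/* Hypothetical scratch file with lines of 7, 5 and 3 characters. */
filename demo temp;

data _null_;
  file demo;
  put 'K9616LM';
  put '19058';
  put 'ABC';
run;

/* Formatted input ($7. without the colon) asks for exactly 7 columns. */
/* With MISSOVER, a line shorter than 7 columns gives a missing value. */
data with_missover;
  infile demo missover;
  input code $7.;
run;

/* With TRUNCOVER, SAS keeps whatever characters the short line has. */
data with_truncover;
  infile demo truncover;
  input code $7.;
run;

proc print data=with_missover;  run;
proc print data=with_truncover; run;
```

Comparing the two PROC PRINT listings should show the short values missing in the first dataset and present in the second.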

Upvotes: 0

Sean

Reputation: 1120

I think Joe has the right idea - your variable lengths are messed up. I was able to produce the desired result using your code but with some renaming and resizing of your variables.

DATA datos;
    INFILE "C:\Users\UserName\Documents\tmp.csv" DLM="," DSD MISSOVER;
    INPUT a:$1. b:$7. c:$44. d:$17. e:$5. f:$9. g:$7.;
RUN;
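For comparison, here is a sketch that keeps the variable names from the question and puts the colon modifier on every character informat, so each field ends at the comma instead of after a fixed number of columns. It reads two of the sample rows in-stream so it can be run without the file. The dataset name `datos_demo` and the `$8.` width for MaleCode are assumptions, not taken from the question.

```sas
data datos_demo;
  /* DSD treats consecutive commas as empty fields; the default delimiter is a comma. */
  infile datalines dsd truncover;
  input Num
        Code       :$7.
        Pedigree   :$44.
        LineCode   :$17.
        FemaleCode :$5.
        MaleCode   :$8.   /* assumed width; this field is empty in the sample rows */
        NFemale    :$9.
        NMale      :$7.;
datalines;
2,09VL101,20347 PL6 O94 E98-1-0/K9616LM,19058/K9616LM,19058,,19058,K9616LM
5,09VL105,,V4649A/F0001LM,,,V4649A,F0001LM
;
run;

proc print data=datos_demo;
run;
```

Because the delimiter is a comma, the blanks inside Pedigree stay part of the value.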


Upvotes: 1
