Reputation: 25
I'm trying to import some data from a .txt file into WPS (which works with SAS language). Everythings worked perfectly until I realized that for one column/variable, some of my observations were cut. For example in the column/variable Type, my 5 first observations in the original file were:
RAN
FG
MAJOFG
ARBA
ARBV
but in my data set in sas, here's the data I get:
RAN
FG
MAJ
ARB
ARB
So I thought about using a format during the import to make sas always take the $8. format. Can it work with proc import? Here is my code:
proc import out = sasuser.frais
datafile="~/NAS/M/XXX.txt"
dbms=dlm replace;
delimiter=";";
getnames=yes;
datarow=2;
run;
I also tried to use data infile but for some reason it moves my lines of data (the beginning of the second line starts at the end of the first one, probably because most of the time my last 4 columns/variables are empty). And I have more than 20 variables, most of which are properly imported by sas (in terms of format I mean), so if I could avoid naming a format for each of them, it would be great!
Upvotes: 1
Views: 20740
Reputation: 25
My problem is that I plan to use this importation code for multiple files and the order of the observations for type is random. So it's totally possible that for 10 or 100 row, the observations have a length of 3 and then one observation of 6. But even if I don't know the order, I know the differents values possible (with a max length of 6). So guessingtows wouldn't work in my case. But I found a way to correct my code with the link provided by John Doe (https://communities.sas.com/t5/Base-SAS-Programming/Text-File-Import-force-the-correct-Informat-and-format/td-p/281701) I took the data generated by the proc import and modified it for the one variable which needed changes. Thank you all!
Upvotes: 0
Reputation: 27508
Try using the GUESSINGROWS
statement:
Proc IMPORT … ;
…
GUESSINGROWS = 100;
run;
When using a DATA
step with an INPUT
statement, and some rows are incomplete, your INFILE
statement should specify the MISSOVER
option. This will help you create a data set where each row corresponds to each row of the data file.
INFILE … FIRSTOBS=2 DLM=';' DSD MISSOVER;
length v1-v20 $30;
INPUT v1-v20;
During the data step implicit loop step, any variables at the end of the data line without corresponding input values will be set to missing, and the input statement will not attempt to look for more values on the next line.
Upvotes: 2
Reputation: 9569
You may be able to get acceptable results by tweaking the guessingrows
parameter rather than specifying an informat, e.g.:
data _null_;
file "%sysfunc(pathname(work))\test.txt";
infile cards;
input;
put _infile_;
cards;
RAN
FG
MAJOFG
ARBA
ARBV
;
run;
proc import out = example
datafile="%sysfunc(pathname(work))\test.txt"
dbms=dlm replace;
delimiter=";";
getnames=yes;
datarow=2;
guessingrows=3;
run;
If you change this from 3 to 2, proc import only looks at the first 2 data rows when guessing an appropriate length for the variable RAN
and truncation occurs for subsequent rows with longer values.
Upvotes: 0