Reputation: 45
I want to read into SAS a text file data set that uses two different delimiters, "|" and the string "[end text]". It is arranged as below:
var1|var2|var3
4657|366|text that
has some line
breaks [end text]
45|264| more text that has
line breaks [end text]
I am trying to figure out how to recognize both of these delimiters. I tried to use the DLMSTR option, but this didn't work:
data new ;
infile 'file.txt' dlmstr='|'||'[report_end]' DSD firstobs=2 ;
input var1 var2 var3 $;
run;
Is there any way to use these two delimiters at the same time? Or am I using the wrong input style to import my data?
Upvotes: 2
Views: 1273
Reputation: 51566
SAS can read delimited files that have embedded line breaks, as long as the embedded line breaks use a different character than the normal end of line. So if your real observations end with CRLF (normal for a Windows text file) and the embedded line breaks are just a single LF character, then those extra breaks will be treated as just another character in that field:
var1|var2|var3<CR><LF>
4657|366|text that<LF>
has some line<LF>
breaks [end text]<CR><LF>
45|264| more text that has<LF>
line breaks [end text]<CR><LF>
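If your file already looks like that, a plain delimited read should be enough. Here is a minimal sketch, assuming a hypothetical file name file_crlf.txt and a guessed maximum length of 2000 for var3 (neither comes from your post). TERMSTR=CRLF only needs to be spelled out where it is not already the default, such as on Unix.

```sas
data want_crlf;
  /* file_crlf.txt is a hypothetical name for a file laid out as shown above */
  infile 'file_crlf.txt' dlm='|' dsd truncover firstobs=2
         termstr=crlf lrecl=1000000;
  length var1 var2 8 var3 $2000;   /* 2000 is an assumed maximum length */
  input var1 var2 var3;
  /* optional: replace the embedded LF characters with blanks */
  var3 = translate(var3, ' ', '0a'x);
run;
```

Without the TRANSLATE call the LF characters simply stay inside var3.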
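For example, here is a data step that could convert your original file into that form (original and copy are filerefs that you would define with FILENAME statements).
data _null_;
  infile original lrecl=32767 ;
  /* write every output line with a bare LF */
  file copy lrecl=1000000 termstr=lf ;
  input ;
  /* turn the end marker into a CR, so real record ends become CR+LF */
  _infile_ = tranwrd(_infile_,'[end text]','0d'x);
  /* the header line has no marker, so give it a CR explicitly */
  if _n_=1 then _infile_=trim(_infile_)||'0d'x;
  len = length(_infile_);
  put _infile_ $varying32767. len ;
run;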
But it might be better to replace the embedded line breaks with some other character, like ^, instead.
data _null_;
  infile original truncover ;
  file copy lrecl=1000000 ;
  input line $char32767.;
  len = length(line);
  /* write the line and hold the output pointer on the same line */
  put line $varying32767. len @;
  /* header or end marker: finish the output line; otherwise join with ^ */
  if _n_=1 or index(_infile_,'[end text]') then put ;
  else put '^' @;
run;
Result:
var1|var2|var3
4657|366|text that^has some line^breaks [end text]
45|264| more text that has^line breaks [end text]
That file is then easy to read as an ordinary pipe-delimited file:
Obs var1 var2 var3
1 4657 366 text that^has some line^breaks [end text]
2 45 264 more text that has^line breaks [end text]
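The read that produces a listing like that could be sketched as follows, reusing the copy fileref from the conversion step. The data set name want and the length of 2000 for var3 are my assumptions, so adjust them to your data.

```sas
data want;
  /* copy is the fileref written by the conversion step */
  infile copy dlm='|' dsd truncover firstobs=2 lrecl=1000000;
  length var1 var2 8 var3 $2000;   /* 2000 is an assumed maximum length */
  input var1 var2 var3;
run;

proc print data=want;
run;
```

If you do not want the markers kept in var3, adding something like var3 = tranwrd(var3,'[end text]',' '); after the INPUT statement should remove the end marker, and translate(var3,' ','^') would turn the ^ characters back into blanks.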
Upvotes: 1