Reputation: 109
Here is the part of dataset:
Obs Buffer
...
75 14
76 13
77 64
78 38.1%
79 29.2%
80 69.2%
81 33
82 5-12
...
I only need the data containing "%" and the two rows ahead of this. For example, in this case I want to pull out "13" "64" "38.1%" "29.2%" and "69.2%".
Is there a way I can achieve this?
Upvotes: 1
Views: 739
Reputation: 109
By following kungfujam's thought, I get the code below and it works.
data adjust;
set source;
oneBefore = lag1(Buffer);
twoBefore = lag2(Buffer);
threeBefore = lag3(Buffer);
fourBefore = lag4(Buffer);
if index(buffer,'%')^=0 and index(onebefore,'%')^=0 and index(twobefore,'%')^=0then do;
x = fourBefore;
output;
x = threeBefore;
output;
x = twoBefore;
output;
x = oneBefore;
output;
x=buffer;
output;
end;
keep x;
run;
Upvotes: 0
Reputation: 8335
To answer your question: the _N_
variable will return you the number of times that the data step has looped past the data statement. (http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000695104.htm)
However, to solve your problem use lag()
(http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212547.htm) and a contains statement e.g.
data justBuffers;
set yourDS;
twoBefore = lag2(Buffer);
oneBefore = lag1(Buffer);
if Buffer ? '%' then do;
if not missing(twoBefore) then do;
x = twoBefore;
output;
end;
if not missing(oneBefore) then do;
x = oneBefore;
output;
end;
x = Buffer;
output;
call missing(oneBefore, twoBefore);
end;
keep x;
run;
I've not tested the code so watch out! I'm sure you could make it smoother.
Upvotes: 0
Reputation: 63424
I like using point
for this sort of thing. _N_
is reliable as a row counter as long as you're not doing anything funny with the data step loop.
data have;
length buffer $50;
input obs buffer $;
datalines;
75 14
76 13
77 64
78 38.1%
79 29.2%
80 69.2%
81 33
82 5-12
;;;;
run;
data want;
set have;
pointer=_N_;
if find(buffer,'%') then do;
output;
pointer=_N_-1;
set have point=pointer;
if not (find(buffer,'%')) then do;
output;
pointer=_N_-2;
set have point=pointer;
if not (find(buffer,'%')) then output;
end;
end;
run;
If you need to restore your order you can sort by pointer
afterwards (or obs
, if that is a real variable - I assume it is not). If obs
is indeed a real variable (or if you make it real with a view), there is an interesting way in SQL to do this:
proc sql;
create table want as
select H.* from have H
left join have V on H.obs=V.obs-1
left join have A on H.obs=A.obs-2
where
(find(H.buffer,'%'))
or
(find(V.buffer,'%'))
or
(find(A.buffer,'%'))
order by H.obs
;
quit;
How to make obs real without a data pass:
data have_vw/view=have_vw;
set have;
obs=_n_;
run;
And then use have_vw
instead of have
in the SQL query (in all three spots).
Upvotes: 1