Reputation: 763
I have the following data set, which lists variables with their respective p-values and R-squared values from simple linear regressions.
data have;
input Variable$ Probt R_value tie$;
cards;
X1 0.0016 0.4344 .
X2 0.0003 0.5204 .
X3 0.0001 0.7497 yes
X4 0.0001 0.9026 yes
run;
However, as you can see, two variables share the Probt value of 0.0001, and I have created a variable called tie to flag the situation where two variables have the same p-value.
What I want is the following: where there is a tie, I want to break it by keeping only the tied variable with the highest R_value, so that the result looks like this:
data want;
input Variable$ Probt R_value tie$;
cards;
X1 0.0016 0.4344 .
X2 0.0003 0.5204 .
X4 0.0001 0.9026 yes
run;
Upvotes: 1
Views: 114
Reputation: 3315
Something like the code below should work, but be careful about how the tie value is computed, as mentioned by @reeza and @joe.
data have;
input Variable$ Probt R_value tie$;
cards;
X1 0.0016 0.4344 .
X2 0.0003 0.5204 .
X3 0.0001 0.7497 yes
X4 0.0001 0.9026 yes
X5 0.0001 0.9028 yes
X6 0.0002 0.7499 yes
X7 0.0002 0.9027 yes
run;
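proc sql;
create table want as
select * from have a
/* keep untied rows; among tied rows keep only the highest R_value per probt */
where tie ne 'yes' or R_value =
(select max(R_value) from have b
where a.probt = b.probt
and b.tie = 'yes');
quit;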
Upvotes: 1
Reputation: 63424
Assuming the probt values are truly identical, as they are in your example, you can do something as simple as using the last. variable (this also assumes the data are sorted in that order; if not, use proc sort first):
data want;
set have;
by descending probt r_value;
if last.probt; *if it is the last record from any set of identical probt values, keep it;
run;
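If have is not already in that order, a sort along these lines could be run before the data step. This is only a minimal sketch, assuming the data set and variable names from the question; the BY order simply mirrors the one used in the data step.

```sas
/* Sketch: put HAVE into the order the data step expects.       */
/* Within each probt value, the highest R_value ends up last.   */
proc sort data=have;
  by descending probt r_value;
run;
```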
If the probt values are rounded and not truly identical, you first need to create a variable that is truly identical across the tied rows (using round). If you already computed tie, you may have done this already.
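A rough sketch of that rounding step is below. The names have_rounded and probt_r are hypothetical, and the rounding unit of 0.0001 is an assumption based on the four decimal places shown in the question; use whatever precision you actually consider a tie.

```sas
/* Sketch: build a rounded key so near-equal p-values group together. */
data have_rounded;                   /* hypothetical data set name */
  set have;
  probt_r = round(probt, 0.0001);    /* hypothetical variable; unit is an assumption */
run;

/* Sort so that, within each rounded key, the highest R_value is last. */
proc sort data=have_rounded;
  by descending probt_r r_value;
run;

/* Keep the last row of each rounded-key group, then drop the helper. */
data want;
  set have_rounded;
  by descending probt_r r_value;
  if last.probt_r;
  drop probt_r;
run;
```

Grouping on the rounded key rather than on probt itself means that two values differing only beyond the fourth decimal place are treated as one tie.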
Upvotes: 1