Reputation: 99
I have a dataset where I have different names in one column, the names can be duplicate. My task here is to compare each and every name with the rest of the names in the column.For example if I take the name 1 "Vishal" I have to compare it with all the names from 2 to 13. If there is a matching name from row 2 to 13 there will be different column made "flag" with value of Y if there is a duplicate if no duplicate then a value of N.I have to perform this operation with all the names in the group
I have written a code which looks like this:
data Name;
input counter name $50.;
cards;
1 vishal
2 swati
3 sahil
4 suman
5 bindu
6 bindu
7 vishal
8 tushar
9 sahil
10 swati
11 gudia
12 priyansh
13 priyansh
;
proc sql;
select count(name) into: n from swati;
quit;
proc sql;
select name into: name1 -:name13 from swati;
quit;
options mlogic mprint symbolgen;
%macro swati;
data name1;
set swati;
%do i = 1 %to 1;
%do j= %eval(&i.+1) %to &n.;
if &&name&i. =&&name&j. then flag="N";
else flag="Y";
%end;
%end;
run;
%mend;
%swati;
the code gives me the vale N for all the names even if there is a name matching, also it makes a different variable with using all the variable names.*
The desired output is shown below
Name Flag
vishal N
swati N
sahil N
suman Y
bindu N
bindu Y
vishal Y
tushar Y
sahil Y
swati Y
gudia Y
priyansh N
priyansh Y
So basically we started finding vishal (the first name) from 2 to 13 and see if there is a duplicate, if there is the flag is N i.e. there is a duplicate. Let us see the name "Suman" which is the fourth name in the list, and we start searching for its matching from 5 to 13. Since there isn't any duplicate for that we have flagged it as "Y".
WE HAVE TO DO THIS USING A DO LOOP
Upvotes: 0
Views: 447
Reputation: 51566
Your answer for the last observation does not look right. Is there another condition such that if it is the last record the flag should be 'N' instead of 'Y'?
I really see no reason why you have to use a DO loop. But you could place a DO loop around a SET statement with the POINT= option to look for matching names.
data want ;
set name nobs=nobs ;
length next $50;
next=' ';
do p=_n_+1 to nobs until (next=name) ;
set name(keep=name rename=(name=next)) point=p;
end;
if next=name then flag='N'; else flag='Y';
drop next;
run;
You could also take advantage of the COUNTER variable and do it using GROUP BY
in a SELECT
statement in PROC SQL
.
proc sql ;
create table want2 as
select *
, case when (counter = max(counter)) then 'Y' else 'N' end as flag
from name
group by name
order by counter
;
quit;
Upvotes: 0
Reputation: 21274
Resort by Order if desired
proc sort data=name;
by name;
run;
data want;
set name;
by name;
if first.name and last.name then unique='Y';
else unique='N';
run;
proc sort data=want;
by counter;
run;
Upvotes: 2