Reputation: 21
My data is as follows:
ID
1
2
3
3
4
5
6
6
I want to create a column that indicates the uniqueness of a value in the ID column as such:
ID COUNT
1 1
2 1
3 1
3 0
4 1
5 1
6 1
6 0
I'd like to do this without creating a temporary table, via a subquery or something. Any assistance would be much appreciated.
Upvotes: 0
Views: 88
Reputation: 51566
That type of logic is not really amenable to SQL since the order of observations is not really insured. In a more modern version of SQL you could use windowing functions (like ROW_NUMBER() with PARTITION BY) to impose an record count.
If you really wanted to try to do it just in PROC SQL you might need to resort to using the undocumented MONOTONIC()
function. But even then to defeat the optimizer eliminating the duplicate rows you might need to make a temporary table with the row counter first.
data have;
input ID @@;
datalines;
1 2 3 3 4 5 6 6
;
proc sql ;
create table _temp_ as select id,monotonic() as row from have;
create table want as
select a.id
, b.row=min(b.row) as FLAG
from have a,_temp_ b
where a.id=b.id
group by a.id
order by 1,2
;
quit;
Upvotes: 0
Reputation: 489
One option would be to use by functionality in the data step:
data have;
input ID;
datalines;
1
2
3
3
4
5
6
6
;run;
data want;
set have;
by ID;
if first.ID then count = 1;
else count = 0;
run;
Upvotes: 2