Reputation: 161
I'm trying to count the number of times an email appears in my data if it appears in Week 6. I'm currently using this code
if week=6 has6=1.
aggregate out=* mode=addvariables overwrite=yes/break=email /has6=max(has6).
aggregate out=* mode=addvariables /break=email /n=sum(has6).
This is working as expected but I left out one condition that I need. Email addresses can show up multiple times per week, but I only want to count once per week.
For example:
Email Week N
[email protected] 6 2
[email protected] 3
[email protected] 4
[email protected] 6 1
[email protected] 4 2
[email protected] 4 2
So even though [email protected] appears in the data three times, I only want her to show as N=2 because she appears only two times in different weeks (6 and 4). I do not want it to count her second instance in week 4 (which is what the current code is doing: She is being counted as a 3 because she appears three times).
So my two conditions are:
TIA!
Upvotes: 2
Views: 69
Reputation: 11360
This will recreate the example data you gave:
data list list/email(a50) week(f1).
begin data
"[email protected]" 6
"[email protected]" 3
"[email protected]" 4
"[email protected]" 6
"[email protected]" 4
"[email protected]" 4
end data.
First we'll identify each email that had week 6 at least once, and mark all it's occurences :
if week=6 has6=1.
aggregate out=* mode=addvariables overwrite=yes/break=email /has6=max(has6).
Now there are two ways to proceed.
First method: Second aggregate and reattach results to data:.
sort cases by email week.
dataset name orig.
dataset declare agg.
aggregate out=agg /break=email week/has6=max(has6).
dataset activate agg.
select if has6.
aggregate out=* mode=addvariables/break email/n=n.
dataset activate orig.
match files /file=* /table=agg /by email week.
exe.
Second method: Comparing rows to discount non-unique cases:
sort cases by email week.
compute countThis=has6.
if $casenum>1 and has6 and lag(email)=email and lag(week)=week countThis=0.
exe.
aggregate out=* mode=addvariables /break=email /n=sum(countThis).
Upvotes: 1