Sammie
Sammie

Reputation: 161

Computing new variable based on the values of pre-existing variables

I am trying to create three new variables based off of existing variables within the data. I am trying to create: Market_A (numeric), Segment_A (string) and Sub-Segment_A (string).

I'm trying to base these off of existing variables: Market (numeric), Segment (string), Sub-Segment (string), Email (string) and Week (numeric).

My data has a list of email addresses that can be in weeks 1-6. Their Market, Segment and Sub-Segment can change based on the week they appear. (ie Jane can appear in Week 1 as USA but Week 6 as Canada).

My goal in the new columns is to put only their Market, Segment and Sub-segment from Week 6 every time their email address shows up (even in another week).

If they don't ever show up in Week 6, then I do not want anything to appear in my three new columns.

For example:

enter image description here

Thanks!

Upvotes: 1

Views: 81

Answers (1)

eli-k
eli-k

Reputation: 11350

I assume the data is repeated if week 6 appears more than once - or that week 6 doesn't appear more than once.

if week=6 market_A=market.
string segment_A subsegment_A (a5).
if week=6 segment_A=segment.
if week=6 subsegment_A=subsegment.

aggregate outfile=* mode=addvariables overwrite=yes/break email
   /market_A segment_A subsegment_A=max(market_A segment_A subsegment_A)

Upvotes: 1

Related Questions