I have a dataset in which each ID visited a website and recorded a risk level, coded 0-3. Each ID then returned to the website at later dates and recorded a risk level again. For every return visit, I want to calculate the difference between that visit's risk level and the ID's first recorded risk level.
For example my dataset looks like this:
ID Timestamp RiskLevel
1 20-Jan-21 2
1 04-Apr-21 2
2 05-Feb-21 1
2 12-Mar-21 2
2 07-May-21 3
3 09-Feb-21 2
3 14-Mar-21 1
3 18-Jun-21 0
And I would like it to look like this:
ID Timestamp RiskLevel DifFromFirstRiskLevel
1 20-Jan-21 2 .
1 04-Apr-21 2 0
2 05-Feb-21 1 .
2 12-Mar-21 2 1
2 07-May-21 3 2
3 09-Feb-21 2 .
3 14-Mar-21 1 -1
3 18-Jun-21 0 -2
What should I do?
One way to approach this is the strategy from another answer of mine, but I will use a different approach for this case:
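* Sort so each ID's earliest visit comes first (Timestamp must be a date variable, not a string).
sort cases by ID Timestamp.
* Start with the current risk level, then carry the first value down within each ID.
compute firstRisk=RiskLevel.
if ($casenum>1 and ID=lag(ID)) firstRisk=lag(firstRisk).
* Compute the difference only for return visits, so each ID's first visit stays system-missing.
if ($casenum>1 and ID=lag(ID)) DifFromFirstRiskLevel=RiskLevel-firstRisk.
execute.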
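As a hedged alternative sketch, the first risk level can also be attached to every row with AGGREGATE instead of being carried down with LAG. This assumes the variable names from the question, that Timestamp is stored as a date variable so the sort is chronological, and that no ID has a missing RiskLevel on its first visit (FIRST returns the first nonmissing value in each group, so a missing first visit would silently be skipped).

```
* Assumption: Timestamp is a date variable, so this sort is chronological within ID.
sort cases by ID Timestamp.
* Add each ID's first nonmissing RiskLevel (in file order) to every row of that ID.
aggregate
  /outfile=* mode=addvariables
  /break=ID
  /firstRisk=first(RiskLevel).
* Compute the difference for return visits only, leaving each ID's first row system-missing.
if (ID=lag(ID)) DifFromFirstRiskLevel=RiskLevel-firstRisk.
execute.
```

With the sample data, ID 2 would get firstRisk = 1 on all three rows, and its differences would be missing, 1, and 2.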