Reputation: 83
This might be a very simple question, but I am struggling to solve this problem in R. I have a dataset containing four variables: ID (identifying the participants), Type (with only one value this time), Decision (A or B) and Feedback (0 or 1). The data set for two participants looks like this:
ID Type Decision Feedback
1 1 A 0
1 1 A 0
1 1 B 1
1 1 B 1
1 1 B 0
2 1 A 0
2 1 A 1
2 1 A 1
2 1 A 0
2 1 B 1
etc...
I want to count the number of changes in the decision as a function of the previous feedback. In other words, if the participant chose A and received negative feedback, will she/he choose A again (Stay) or switch to B (Shift)? My code for one participant is the following:
Stay = 0
Shift = 0
for (i in 2:length(mydf$Type)) {
  if (mydf$Decision[i] == "A" && mydf$Feedback[i-1] == 1 && mydf$Decision[i-1] == "A") {
    Stay = Stay + 1
  }
  else if (mydf$Decision[i] == "B" && mydf$Feedback[i-1] == 1 && mydf$Decision[i-1] == "B") {
    Stay = Stay + 1
  }
  else if (mydf$Decision[i] == "A" && mydf$Feedback[i-1] == 1 && mydf$Decision[i-1] == "B") {
    Shift = Shift + 1
  }
  else if (mydf$Decision[i] == "B" && mydf$Feedback[i-1] == 1 && mydf$Decision[i-1] == "A") {
    Shift = Shift + 1
  }
}
However, my data frame contains 20 participants and I don’t know how to extend my code to get the number of stays and shifts for each participant (i.e., to get something like this at the end):
#ID Stay Shift
#1 10 10
#2 16 4
#etc...
Thank you very much for your help in advance.
Upvotes: 3
Views: 197
Reputation: 44614
This is a slightly hairier alternative using the embed function, as mentioned in the comments to @DavidRobinson's answer.
d<-read.table(text="ID Type Decision Feedback
1 1 A 0
1 1 A 0
1 1 B 1
1 1 B 1
1 1 B 0
2 1 A 0
2 1 A 1
2 1 A 1
2 1 A 0
2 1 B 1", header=TRUE)
do.call(rbind,
by(d, d$ID, function(x) {
f <- function(x) length(unique(x)) == 1
stay <- apply(embed(as.vector(x$Decision), 2), 1, f)
neg.feedback <- head(x$Feedback, -1) == 1  # feedback on the previous trial (drops the last row)
c(Stay = sum(stay & neg.feedback), Shift = sum((! stay) & neg.feedback))
})
)
# Stay Shift
# 1 2 0
# 2 2 0
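For readers unfamiliar with embed, here is a minimal sketch of what it returns for a single decision sequence. The vector `dec` is made up for illustration and is not part of the original data.

```r
# Hypothetical decision sequence for one participant (illustration only)
dec <- c("A", "A", "B", "B", "B")

# embed(x, 2) builds a matrix with one row per pair of consecutive values:
# column 1 holds the current value, column 2 the previous one
pairs <- embed(dec, 2)

# A row containing a single unique value means the decision did not change
stay <- apply(pairs, 1, function(p) length(unique(p)) == 1)
stay
```

Each element of `stay` lines up with the feedback of the earlier trial in the pair, which is why the last feedback value is dropped in the answer above.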
Upvotes: 1
Reputation: 55350
How about a nice breakdown by ID and Feedback:
library(data.table)
X <- data.table(mydf, key="ID")
X[, list(Dif=abs(diff(as.numeric(factor(Decision)))),  # factor() so this also works when Decision is character
         FB=head(Feedback, -1)),
  by=ID][, list(Shifted=sum(Dif), Stayed=length(Dif)-sum(Dif)), by=list(ID, FB)]
# ID FB Shifted Stayed
# 1: 1 0 1 1
# 2: 1 1 0 2
# 3: 2 0 1 1
# 4: 2 1 0 2
or, if you don't want the breakdown by Feedback, it is even more succinct:
X[, {Dif = abs(diff(as.numeric(factor(Decision))));  # factor() so this also works when Decision is character
     list(Shifted=sum(Dif), Stayed=length(Dif)-sum(Dif))},
  by=list(ID)]
# ID Shifted Stayed
# 1: 1 1 3
# 2: 2 1 3
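The trick here is that the difference between consecutive factor codes is non-zero exactly when the decision changes. A small sketch for one participant, using a made-up vector `dec` and assuming Decision has only the two levels A and B (with more levels, abs(diff(...)) could exceed 1 and the sum would no longer count shifts):

```r
# Hypothetical decision sequence for one participant (illustration only)
dec <- factor(c("A", "A", "B", "B", "B"))

# Integer codes of the factor levels: A becomes 1, B becomes 2
codes <- as.numeric(dec)

# 1 wherever the decision changed from the previous trial, 0 otherwise
dif <- abs(diff(codes))

# With two levels, summing the changes counts the shifts
counts <- c(Shifted = sum(dif), Stayed = length(dif) - sum(dif))
counts
```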
Upvotes: 1
Reputation: 78600
This is best done using ddply in the plyr package (you'll have to install it), which splits a data frame on one of its columns, runs some analysis on each piece, and recombines the results into a new data frame.
First, write a function num.stay.shift that calculates your stay and shift values given a single subset of the data frame (explained in comments):
num.stay.shift = function(d) {
  # vector of TRUE or FALSE for whether the previous trial's Feedback is 1
  # (the last row is dropped because no decision follows it)
  negative.feedback = (head(d$Feedback, -1) == 1)
  # vector of TRUE or FALSE for whether the decision stayed the same
  # from one trial to the next
  stay = head(d$Decision, -1) == tail(d$Decision, -1)
  # summarize as two values: the number that stayed when feedback == 1,
  # and the number that shifted when feedback == 1
  c(Stay=sum(stay[negative.feedback]), Shift=sum(!stay[negative.feedback]))
}
Then, use ddply to apply that function to each of the individuals within the data frame, splitting it up by ID:
library(plyr)
# mydf is the data frame from the question
print(ddply(mydf, "ID", num.stay.shift))
On the subset of the data frame you show, you would end up with
# ID Stay Shift
# 1 1 2 0
# 2 2 2 0
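If installing plyr is not an option, the same split-apply-combine idea can be sketched in base R with split and sapply. This is only an illustration, assuming the data frame is called `mydf` as in the question and reusing the function defined above; the result is a matrix with one row per ID rather than a data frame.

```r
# Sample data from the question
mydf <- read.table(text = "ID Type Decision Feedback
1 1 A 0
1 1 A 0
1 1 B 1
1 1 B 1
1 1 B 0
2 1 A 0
2 1 A 1
2 1 A 1
2 1 A 0
2 1 B 1", header = TRUE)

# Same per-participant function as above
num.stay.shift <- function(d) {
  negative.feedback <- head(d$Feedback, -1) == 1
  stay <- head(d$Decision, -1) == tail(d$Decision, -1)
  c(Stay = sum(stay[negative.feedback]), Shift = sum(!stay[negative.feedback]))
}

# split() makes one data frame per ID, sapply() applies the function to each,
# and t() turns the result so that each row is a participant
res <- t(sapply(split(mydf, mydf$ID), num.stay.shift))
res
```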
Upvotes: 3