Reputation: 31
I want to exclude rows for participants whose error rate is above 15%. For example, the error rate of participant 2 is 2,97%:
semdata[2,"error_rate"]
[1] "2,97"
But if I run this ifelse statement, many participants get excluded whose error rates are not above 15% (e.g., participant 2), while others are correctly kept.
for(i in 1:NROW(semdata)){
#single trial blocks
ifelse((semdata[i,"error_rate"] >= 15),print(paste(i, "exclusion: error rate ST too high",semdata[i,"dt_tswp.err.prop_st"])),0)
ifelse((semdata[i,"error_rate"] >= 15),semdata[i,6:NCOL(semdata)]<-NA,0)
#dual-task blocks
# ifelse((semdata[i,"error_rate"] >= 15),print(paste(i, "exclusion: error rate DT too high")),0)
# ifelse((semdata[i,"error_rate"] >= 15),semdata[i,6:NCOL(semdata)]<-NA,0)
}
[1] "1 exclusion: error rate ST too high 6,72"
[1] "2 exclusion: error rate ST too high 2,97"
[1] "7 exclusion: error rate ST too high 2,87"
[1] "9 exclusion: error rate ST too high 5,28"
...
What am I doing wrong here?
Upvotes: 0
Views: 74
Reputation: 388982
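You are comparing strings here: error_rate is stored as character (note the quotes around "2,97" in your output), so it is compared with 15 as text, not as a number.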
"6,72" > 15
#[1] TRUE
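To see why this is TRUE: when a character value is compared with a number, R coerces the number to a string and compares the two as text, so the leading characters decide the result. A small sketch using the two values from the question's output:

```r
# 15 is coerced to "15", so both comparisons are done on text.
# "2" and "6" sort after "1", so both results are TRUE even though
# the numbers they represent are well below 15.
p2 <- "2,97" >= 15
p1 <- "6,72" >= 15
print(c(p2, p1))
```

This is why participant 2 is flagged by the loop even though 2,97 is far below the threshold.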
Convert the data to numeric before comparing. Use sub to replace the decimal comma with a point, then pass the result to as.numeric:
as.numeric(sub(",", ".", "6,72"))
#[1] 6.72
This can be compared with 15.
as.numeric(sub(",", ".", "6,72")) > 15
#[1] FALSE
For the entire column:
semdata$error_rate <- as.numeric(sub(",", ".", semdata$error_rate))
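Once the column is numeric, the exclusion itself can probably be done without the loop. The following is only a sketch under assumptions: the toy data frame, the rt_mean column, and the choice of column 3 as the first measurement column are made up for illustration (the question uses 6:NCOL(semdata)), and it keeps the question's >= 15 cutoff.

```r
# Hypothetical toy data standing in for semdata; only error_rate
# comes from the question, the other columns are invented.
semdata <- data.frame(
  id = 1:4,
  error_rate = c("6,72", "2,97", "15,40", "21,05"),
  rt_mean = c(512, 498, 530, 601),
  stringsAsFactors = FALSE
)

# Convert decimal-comma strings to numeric, as above
semdata$error_rate <- as.numeric(sub(",", ".", semdata$error_rate, fixed = TRUE))

# Logical index of rows at or above the threshold;
# rows with a missing error rate are not excluded
exclude <- !is.na(semdata$error_rate) & semdata$error_rate >= 15

# Set the measurement columns of excluded rows to NA
# (column 3 onward in this toy data)
semdata[exclude, 3:ncol(semdata)] <- NA

# Row numbers that were excluded
print(which(exclude))
```

Here exclude is a plain logical vector, so which(exclude) gives the excluded row numbers for reporting, replacing the print calls inside ifelse.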
Upvotes: 1