rbeginner

Reputation: 31

What is wrong with this ifelse-command? (rows get excluded that don't match the if-statement)

I want to exclude rows for participants whose error rate is above 15%. When I look at the error rate of participant 2, for example, it is 2,97%:

semdata[2,"error_rate"]
[1] "2,97"

But if I run this ifelse-statement, many participants get excluded even though their error rate is not 15% or more (e.g., participant 2), while others are correctly left in:
for(i in 1:NROW(semdata)){
#single trial blocks
ifelse((semdata[i,"error_rate"] >= 15),print(paste(i, "exclusion: error rate ST too high",semdata[i,"dt_tswp.err.prop_st"])),0)
ifelse((semdata[i,"error_rate"] >= 15),semdata[i,6:NCOL(semdata)]<-NA,0)
#dual-task blocks
# ifelse((semdata[i,"error_rate"] >= 15),print(paste(i, "exclusion: error rate DT too high")),0)
# ifelse((semdata[i,"error_rate"] >= 15),semdata[i,6:NCOL(semdata)]<-NA,0)
}
[1] "1 exclusion: error rate ST too high 6,72"
[1] "2 exclusion: error rate ST too high 2,97"
[1] "7 exclusion: error rate ST too high 2,87"
[1] "9 exclusion: error rate ST too high 5,28"
...

Participant 2 has an error rate of 2,97 and should not be excluded. What am I doing wrong here?

Upvotes: 0

Views: 74

Answers (1)

Ronak Shah

Reputation: 388982

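You are comparing strings here. error_rate is a character column, so R coerces the number 15 to the string "15" and compares the two values as text, not as numbers.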

"6,72" > 15
#[1] TRUE
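
To see why some participants are flagged and others are not, here is a minimal sketch. The values in `rates` are made up for illustration (only "6,72" and "2,97" appear in the question), and the variable names are hypothetical. It compares the text-based result with the numeric one side by side.

```r
# Made-up error rates stored as text with a decimal comma
rates <- c("6,72", "2,97", "12,50", "22,10")

# Comparing a character vector with a number coerces the number to "15",
# so these two comparisons are equivalent
as_text <- rates >= 15
same_as <- rates >= "15"

# Numeric comparison after swapping the decimal comma for a point
as_number <- as.numeric(sub(",", ".", rates, fixed = TRUE)) >= 15

print(data.frame(rates, as_text, as_number))
```

In the text comparison, strings are ordered character by character, so any value whose first digit sorts after "1" is treated as "greater than 15" regardless of its numeric size.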

You should convert the data to numeric before comparing. Because the values use a decimal comma, first replace the comma with a point using sub and then call as.numeric:

as.numeric(sub(",", ".", "6,72"))
#[1] 6.72

The result is a number, so it can be compared with 15:

as.numeric(sub(",", ".", "6,72")) > 15
#[1] FALSE

To convert the entire column:

semdata$error_rate <- as.numeric(sub(",", ".", semdata$error_rate))
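
Once the column is numeric, the exclusion itself can also be done without a loop. The following is only a sketch under assumptions: the toy data frame stands in for the real `semdata`, the columns `id`, `rt_mean` and `accuracy` are hypothetical, and `cols` uses `3:ncol(semdata)` where the question uses `6:NCOL(semdata)`.

```r
# Toy data standing in for semdata; all columns except error_rate are hypothetical
semdata <- data.frame(
  id         = 1:4,
  error_rate = c("6,72", "2,97", "15,30", "22,10"),
  rt_mean    = c(512, 498, 530, 601),
  accuracy   = c(0.93, 0.97, 0.85, 0.78),
  stringsAsFactors = FALSE
)

# Convert decimal-comma text to numeric, as in the line above
semdata$error_rate <- as.numeric(sub(",", ".", semdata$error_rate, fixed = TRUE))

# Logical index of rows to exclude; rows with a missing error rate are kept
too_high <- !is.na(semdata$error_rate) & semdata$error_rate >= 15

# Blank out the data columns for those rows (3:ncol here, 6:NCOL in the question)
cols <- 3:ncol(semdata)
semdata[too_high, cols] <- NA

# Row numbers of the excluded participants
print(which(too_high))
```

ifelse() is meant for choosing values element-wise across a vector; for side effects such as printing or assigning inside a loop, a plain if statement or logical indexing as above is usually the clearer choice.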

Upvotes: 1
