Alexander

Reputation: 4635

mutate() and ifelse() fail because of NA values in a column

I've run into a problem while trying to create a new column with ifelse(). A quite similar question is "dplyr error: strange issue when combining group_by, mutate and ifelse. Is it a bug?"

set.seed(101)
time  <- sort(runif(10, 0, 10))
group <- rep(c(1, 2), each = 5)
az    <- c(sort(runif(5, -1, 1), decreasing = TRUE),
           sort(runif(5, -1, 0.2), decreasing = TRUE))

df <- data.frame(time, az, group)

#       time          az group
#1  0.4382482  0.86326886     1
#2  2.4985572  0.75959146     1
#3  3.0005483  0.46394519     1
#4  3.3346714  0.41374948     1
#5  3.7219838 -0.08975881     1
#6  5.4582855 -0.01547669     2
#7  5.8486663 -0.29161632     2
#8  6.2201196 -0.50599980     2
#9  6.5769040 -0.73105782     2
#10 7.0968402 -0.95366733     2

In df I am trying to create a clas column with a conditional mutate. However, because sw_time contains NA, the whole clas column also becomes NA, even though group 1 should be "nrm" as usual.

library(dplyr)

df1 <- df %>%
  group_by(group) %>%
  mutate(sw_time = abs(time[which(az <= 0.8)[1]] - time[which(az > 0)[1]])) %>%
  mutate(clas = as.numeric(ifelse(sw_time < 3, "nrm", "abn")))

Source: local data frame [10 x 5]
Groups: group [2]

        time          az group  sw_time  clas
       (dbl)       (dbl) (dbl)    (dbl) (dbl)
1  0.4382482  0.86326886     1 2.060309    NA
2  2.4985572  0.75959146     1 2.060309    NA
3  3.0005483  0.46394519     1 2.060309    NA
4  3.3346714  0.41374948     1 2.060309    NA
5  3.7219838 -0.08975881     1 2.060309    NA
6  5.4582855 -0.01547669     2       NA    NA
7  5.8486663 -0.29161632     2       NA    NA
8  6.2201196 -0.50599980     2       NA    NA
9  6.5769040 -0.73105782     2       NA    NA
10 7.0968402 -0.95366733     2       NA    NA

Thanks in advance for your help!

Upvotes: 1

Views: 393

Answers (1)

akrun

Reputation: 887088

Converting a character vector such as "nrm"/"abn" to numeric results in NA, because the strings cannot be parsed as numbers. Instead, we can convert to a factor first, which coerces to integer codes:

df %>%
  group_by(group) %>%
  mutate(sw_time = abs(time[which(az <= 0.8)[1]] - time[which(az > 0)[1]]),
         clas = as.integer(factor(ifelse(sw_time < 3, "nrm", "abn"))))
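The difference between the two coercions can be checked in isolation with base R. This is only a small sketch; the vector `labels` is a made-up stand-in for the result of ifelse(), not data from the question.

```r
# Hypothetical label vector, standing in for the output of ifelse()
labels <- c("nrm", "abn", NA, "nrm")

# Strings that are not numbers cannot be parsed, so every element becomes NA
# (suppressWarnings hides the "NAs introduced by coercion" warning)
as_num <- suppressWarnings(as.numeric(labels))

# A factor stores integer codes; levels are sorted alphabetically by default,
# so "abn" is coded 1 and "nrm" is coded 2
as_code <- as.integer(factor(labels))

print(as_num)   # NA NA NA NA
print(as_code)  # 2 1 NA 2
```

Note that the integer codes depend on the alphabetical order of the labels, so "nrm" ends up as 2 here rather than 1.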

If we are only interested in getting 'nrm' and 'abn', just remove the as.integer(factor(...)) wrapping:

df %>%
  group_by(group) %>%
  mutate(sw_time = abs(time[which(az <= 0.8)[1]] - time[which(az > 0)[1]]),
         clas = ifelse(sw_time < 3, "nrm", "abn"))
#        time          az group  sw_time  clas
#       <dbl>       <dbl> <dbl>    <dbl> <chr>
#1  0.4382482  0.86326886     1 2.060309   nrm
#2  2.4985572  0.75959146     1 2.060309   nrm
#3  3.0005483  0.46394519     1 2.060309   nrm
#4  3.3346714  0.41374948     1 2.060309   nrm
#5  3.7219838 -0.08975881     1 2.060309   nrm
#6  5.4582855 -0.01547669     2       NA  <NA>
#7  5.8486663 -0.29161632     2       NA  <NA>
#8  6.2201196 -0.50599980     2       NA  <NA>
#9  6.5769040 -0.73105782     2       NA  <NA>
#10 7.0968402 -0.95366733     2       NA  <NA>
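The remaining NA values in group 2 are not caused by ifelse() itself. A minimal base R sketch, using the group 2 values copied from the question's data frame, suggests where they come from: no az value in that group is positive, so the index is NA before ifelse() ever runs.

```r
# Values copied from group 2 of the question's data frame
time <- c(5.4582855, 5.8486663, 6.2201196, 6.5769040, 7.0968402)
az   <- c(-0.01547669, -0.29161632, -0.50599980, -0.73105782, -0.95366733)

# No element satisfies az > 0, so which() returns integer(0),
# and taking [1] of an empty vector gives NA
idx <- which(az > 0)[1]

# Indexing with NA yields NA, which propagates through abs() and ifelse()
sw_time <- abs(time[which(az <= 0.8)[1]] - time[idx])
clas <- ifelse(sw_time < 3, "nrm", "abn")

print(idx)      # NA
print(sw_time)  # NA
print(clas)     # NA
```

So for group 2 an NA in clas is the expected result of the formula as written, while group 1 is classified normally.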

We can also use data.table, assigning both columns by reference within each group:

library(data.table)
setDT(df)[, c("sw_time", "clas") := {
  v1 <- abs(time[which(az <= 0.8)[1]] - time[which(az > 0)[1]])
  .(v1, c("abn", "nrm")[(v1 < 3) + 1])
}, by = group]
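The expression c("abn", "nrm")[(v1 < 3) + 1] replaces ifelse() with plain indexing: the logical comparison becomes 0 or 1, and adding 1 turns it into a position in the label vector. A base R sketch of the idea follows; the helper name `classify` and the input values are made up for illustration.

```r
# Hypothetical helper wrapping the indexing trick from the data.table call
classify <- function(v) {
  # (v < 3) is FALSE/TRUE, i.e. 0/1; adding 1 gives position 1 ("abn") or 2 ("nrm")
  c("abn", "nrm")[(v < 3) + 1]
}

# One value below the threshold, one above, and one missing
res <- classify(c(2.060309, 4.5, NA))
print(res)  # "nrm" "abn" NA
```

A missing value gives an NA index, and indexing with NA returns NA, so the behaviour matches ifelse() for this case.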

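If the final output does not need the labels 'nrm' and 'abn', we don't need the ifelse() part at all. We can directly use as.integer(sw_time < 3), which gives 1 where sw_time is below 3, 0 otherwise, and NA where sw_time is NA.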
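As a quick check of the as.integer(sw_time < 3) shortcut, here is a short base R sketch with made-up sw_time values (only the first one appears in the question's output).

```r
# Hypothetical sw_time values: below the threshold, above it, and missing
sw_time <- c(2.060309, 4.5, NA)

# TRUE/FALSE coerce to 1/0, and NA stays NA
flag <- as.integer(sw_time < 3)
print(flag)  # 1 0 NA
```

With this encoding "normal" is 1 and "abnormal" is 0, which differs from the factor-based codes, where the alphabetical level order makes "abn" 1 and "nrm" 2.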

Upvotes: 2
