Reputation: 33
A five columns table ("id", "othermood_v","rass_v", "gcs_v" and "cam_v") with around 52000 rows. There are three values in the last column ("cam_v"):0,1,2 as a class label. The "cam_v" column now has value as 1, 2 and NA. I would like to replace the NA value with either 0 or 1 based the other three columns "othermood_v", "rass_v" and "gcs_v". So if any of these three columns in the same row has a value like 1, then cam_v would be labeled as 1, otherwise 0. I tried to loop through data with a condition like
if df$othermood_v>0|df$rass_v>0|df$gcs_v >0, then df$cam_v=1 else
0, rm NA = True
or
if (df$othermood_v+df$rass_v+df$gcs_v) >0, then cam_v=1 else 0
But I don't know how to get it to work. Any suggestions? BTW, the id is unique now. Thanks.
id othermood_v rass_v gcs_v cam_v
100078 0 0 0 NA
100079 0 0 0 NA
100081 0 0 0 NA
100085 1 1 0 NA
100087 1 1 0 NA
100088 1 0 0 NA
100091 1 1 1 2
100094 0 1 0 NA
100095 1 0 0 NA
100096 0 0 0 NA
100098 1 1 1 2
100099 0 1 0 NA
100102 1 0 0 NA
100103 1 0 0 NA
100104 1 1 0 2
100106 0 0 0 NA
100108 1 0 0 NA
100109 1 0 0 NA
100112 1 0 0 NA
100113 1 1 1 1
100114 1 0 0 NA
100116 1 0 0 NA
100117 1 0 0 NA
100118 0 1 0 NA
Upvotes: 2
Views: 393
Reputation: 525
You were close with your method, You just needed to change how you were doing if else. The below should work:
df$cam_v<-ifelse((df$othermood_v>0|df$rass_v>0|df$gcs_v >0), 1,0)
Upvotes: 0
Reputation: 887941
We create a logical vector and then replace using another condition created with rowSums
i1 <- is.na(df1$cam_v) # logical index of NA elements in 'cam_v'
# assign the values 0 or 1 based on the occurrence of 1 in
# either one of the columns from 2 to 4
df1$cam_v[i1] <- +(rowSums(df1[i1, 2:4] > 0) > 0)
df1 <- structure(list(id = c(100078L, 100079L, 100081L, 100085L, 100087L,
100088L, 100091L, 100094L, 100095L, 100096L, 100098L, 100099L,
100102L, 100103L, 100104L, 100106L, 100108L, 100109L, 100112L,
100113L, 100114L, 100116L, 100117L, 100118L), othermood_v = c(0L,
0L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 0L), rass_v = c(0L, 0L, 0L, 1L, 1L, 0L,
1L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L,
0L, 1L), gcs_v = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L), cam_v = c(NA,
NA, NA, NA, NA, NA, 2L, NA, NA, NA, 2L, NA, NA, NA, 2L, NA, NA,
NA, NA, 1L, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA,
-24L))
Upvotes: 1
Reputation: 12478
A solution using dplyr
library(dplyr)
df_clean <- df %>%
mutate(cam_v = ifelse(!is.na(cam_v), cam_v,
ifelse((othermood_v + rass_v + gcs_v) > 0, 1, 0)))
> df_clean
id othermood_v rass_v gcs_v cam_v
1 100078 0 0 0 0
2 100079 0 0 0 0
3 100081 0 0 0 0
4 100085 1 1 0 1
5 100087 1 1 0 1
6 100088 1 0 0 1
7 100091 1 1 1 2
8 100094 0 1 0 1
9 100095 1 0 0 1
10 100096 0 0 0 0
11 100098 1 1 1 2
12 100099 0 1 0 1
13 100102 1 0 0 1
14 100103 1 0 0 1
15 100104 1 1 0 2
16 100106 0 0 0 0
17 100108 1 0 0 1
18 100109 1 0 0 1
19 100112 1 0 0 1
20 100113 1 1 1 1
21 100114 1 0 0 1
22 100116 1 0 0 1
23 100117 1 0 0 1
24 100118 0 1 0 1
Generally, it is preferred here to use dput(head(data, 20))
to provide sample data for your code. I used this to transform yours data:
df <- read.table(text =
"id othermood_v rass_v gcs_v cam_v
100078 0 0 0 NA
100079 0 0 0 NA
100081 0 0 0 NA
100085 1 1 0 NA
100087 1 1 0 NA
100088 1 0 0 NA
100091 1 1 1 2
100094 0 1 0 NA
100095 1 0 0 NA
100096 0 0 0 NA
100098 1 1 1 2
100099 0 1 0 NA
100102 1 0 0 NA
100103 1 0 0 NA
100104 1 1 0 2
100106 0 0 0 NA
100108 1 0 0 NA
100109 1 0 0 NA
100112 1 0 0 NA
100113 1 1 1 1
100114 1 0 0 NA
100116 1 0 0 NA
100117 1 0 0 NA
100118 0 1 0 NA", header = TRUE)
Upvotes: 1