Reputation: 1305
I have two columns that I wish to combine. I would usually sum both columns. However, in this case I need to retain the original NAs and only have the numerical values combined from two columns to one. I have tried to use ifelse statements but since im working with 2x columns then combining both is trickier with nested ifelse.
Heres my example data:
# Example
data <- c(80,7.692307692,
7.692307692
,8.333333333
,9.090909091
,20
,27.27272727
,50
,50
,21.42857143
,58.33333333
,46.66666667
,78.06451613
,186.15384615
,42.85714286
,44.1860465,
20,
25,
40,45,78,55)
df <- data.frame(data)
df$long <- ifelse(df$data <20,1,0) # print 1s
df$long_exit <- df$data <=70 # logical
df$long_sigs <- ifelse(df$long_exit == FALSE,0,NA) # convert FALSE to 1 and retain NAs
desired_sum_output <- c(0,1,1,1,1,NA,NA,NA,NA,NA,NA,NA,0,0,NA,NA,NA,NA,NA,NA,0,NA)
df <- data.frame(df,desired_sum_output)
#sum df$long + df$long_sigs to and retain 0
sums <- df$long + df$long_sigs # summing
> df
data long long_exit long_sigs desired_sum_output
1 80.000000 0 FALSE 0 0
2 7.692308 1 TRUE NA 1
3 7.692308 1 TRUE NA 1
4 8.333333 1 TRUE NA 1
5 9.090909 1 TRUE NA 1
6 20.000000 0 TRUE NA NA
7 27.272727 0 TRUE NA NA
8 50.000000 0 TRUE NA NA
9 50.000000 0 TRUE NA NA
10 21.428571 0 TRUE NA NA
11 58.333333 0 TRUE NA NA
12 46.666667 0 TRUE NA NA
13 78.064516 0 FALSE 0 0
14 186.153846 0 FALSE 0 0
15 42.857143 0 TRUE NA NA
16 44.186047 0 TRUE NA NA
17 20.000000 0 TRUE NA NA
18 25.000000 0 TRUE NA NA
19 40.000000 0 TRUE NA NA
20 45.000000 0 TRUE NA NA
21 78.000000 0 FALSE 0 0
22 55.000000 0 TRUE NA NA
This is usually reasonable:
df$z <- rowSums(df[, c("long", "long_sigs")], na.rm=TRUE)
However i lose my NA positioning.
Reason for retaining NA's is I will use na.locf from zoo packaged to forward fill the NA values.
Upvotes: 1
Views: 53
Reputation: 33772
If I understand correctly, you want to sum with na.rm = TRUE
where long = 1
, otherwise retain the NA. So this should create z = desired_sum_output
:
df$z <- ifelse(df$long == 1,
rowSums(df[, c("long", "long_sigs")], na.rm=TRUE),
rowSums(df[, c("long", "long_sigs")]))
Upvotes: 3