nanicpc
nanicpc

Reputation: 27

R - Create column from conditional operation over other multiple columns

I'm trying to create a new column with the following logic: for each row if the row name is 's' the new column take the value of the sum of non empty columns (in a subset) unless the sum is zero, in which case it should take the value of the v1 column.

An example of the data.frame

  name   year   v1         COL1   COL2 COL3 COL4 COL5 COL6 COL7
 1  S   1960   0.00  0.000000000   NA   NA   NA   NA   NA   NA ...
 2  S   1961   0.00  0.000000000   NA   NA   NA   NA   NA   NA ...
 3  S   1962   0.00  0.000000000   NA   NA   NA   NA   NA   NA ...
 4  S   1963   0.00  0.000000000   NA   NA   NA   NA   NA   NA ...
 5  N   1964   0.00  0.000000000   NA   NA   55   NA   22   NA ...
 6  S   1965  -0.13 -0.160505952   NA   NA   NA   NA   NA   45 ...

ETC. An example of the code that i was trying:

df$nexcol <- 0
df$nexcol[df["name"]=="S"] <-ifelse(rowSums(!is.na(df[df["name"]=="S",5:106]))==0,
                             df[df["name"]=="S","V1"],rowSums(!is.na(df[df["name"]=="S",5:106])))
df$nexcol[df["name"]=="N"] <- 1

The expected result is

   name   year   v1         COL1   COL2 COL3 COL4 COL5 COL6 COL7 ... nexcol
 1  S   1960   2.00  0.000000000   NA   NA   NA   NA   NA   NA ...    2.00
 2  S   1961   3.00  0.000000000   NA   NA   NA   NA   NA   NA ...    3.00
 3  S   1962   4.00  0.000000000   52   NA   NA   NA   10   NA ...    2
 4  S   1963   2.00  0.000000000   NA   NA   NA   NA   NA   NA ...    2.00
 5  N   1964   3.00  0.000000000   NA   NA   55   NA   22   NA ...    1
 6  S   1965  -0.13 -0.160505952   NA   NA   NA   NA   NA   45 ...    1

Upvotes: 0

Views: 143

Answers (1)

akuiper
akuiper

Reputation: 214927

Try something like this, close to what you have tried but more in R way:

df$nexcol = rowSums(!is.na(df[5:10]))
df$nexcol = with(df, ifelse(name == 'N', 1, ifelse(nexcol == 0, v1, nexcol)))

df
#   name year    v1      COL1 COL2 COL3 COL4 COL5 COL6 COL7 nexcol
# 1    S 1960  2.00  0.000000   NA   NA   NA   NA   NA   NA      2
# 2    S 1961  3.00  0.000000   NA   NA   NA   NA   NA   NA      3
# 3    S 1962  4.00  0.000000   52   NA   NA   NA   10   NA      2
# 4    S 1963  2.00  0.000000   NA   NA   NA   NA   NA   NA      2
# 5    N 1964  3.00  0.000000   NA   NA   55   NA   22   NA      1
# 6    S 1965 -0.13 -0.160506   NA   NA   NA   NA   NA   45      1

Or use dplyr package:

library(dplyr)
df %>% mutate(nexcol = rowSums(!is.na(.[5:10])), 
              nexcol = case_when(.$name == 'N' ~ 1, .$nexcol == 0 ~ .$v1, TRUE ~ .$nexcol))

#   name year    v1      COL1 COL2 COL3 COL4 COL5 COL6 COL7 nexcol
# 1    S 1960  2.00  0.000000   NA   NA   NA   NA   NA   NA      2
# 2    S 1961  3.00  0.000000   NA   NA   NA   NA   NA   NA      3
# 3    S 1962  4.00  0.000000   52   NA   NA   NA   10   NA      2
# 4    S 1963  2.00  0.000000   NA   NA   NA   NA   NA   NA      2
# 5    N 1964  3.00  0.000000   NA   NA   55   NA   22   NA      1
# 6    S 1965 -0.13 -0.160506   NA   NA   NA   NA   NA   45      1

Upvotes: 1

Related Questions