Evans Rono

Reputation: 21

How can I add a new column based on conditions on several columns and one row of data?

I have a data frame, df, to which I want to add a new column "dsn" with two categorical groups, "s" and "ns", based on conditions on columns 2 to 6 and row 1.

df:

df <- data.frame(refS_aa = c("", "N",   "L",    "T",    "T" ,"R",   "T",    "Q",    "T",    "N"),
             AAT = c("N",   38404, 0, 0, 0, 0, 31, 0, 0,38389),
             ACA    = c("T",    0,  0,  38387,  9,  7,  2,  351225, 0,  0),
             ACC    = c("T",    66, 0,  0,  38115,  0,  1,  0,  1,  0),
             ACG    = c("T",    0,  0,  0,  4,  0,  0,  0,  0,  0),
             ACT    = c("T",    0,  0,  0,  93, 0,  38304,  0,  38279,  0))

rownames(df) <- c("Used_aa",    "V1",   "V2",   "V3",   "V4",   "V5",   "V6",   "V7",   "V8",   "V9")

df shows the number of observations sharing the amino acids in column "refS_aa" and in row 1, "Used_aa". My real data have over 7,000 observations.

The new column "dsn" should categorize the numerical values in df into "s" and "ns" factors based on conditions of row "Used_aa" and columns "2 to 6".

That is:

  1. "s" should have Column 2 to 6 >0 & refS_aa = Used_aa

  2. "ns" should have Column 2 to 6 >0 & refS_aa ≠ Used_aa

I have searched for solutions everywhere I could and tried different approaches, including:

df$dsn[(df[2:6]) >0 ] & df[,1] == df[1,] <- "s"
df$dsn[(df[2:6]) >0 ] & df[,1] != df[1,] <- "ns"

But I have not succeeded.

I would appreciate any suggestions!

Upvotes: 0

Views: 41

Answers (1)

TarJae

Reputation: 79276

Your data:

df <- data.frame(refS_aa = c("", "N",   "L",    "T",    "T" ,"R",   "T",    "Q",    "T",    "N"),
                 AAT = c("N",   38404, 0, 0, 0, 0, 31, 0, 0,38389),
                 ACA    = c("T",    0,  0,  38387,  9,  7,  2,  351225, 0,  0),
                 ACC    = c("T",    66, 0,  0,  38115,  0,  1,  0,  1,  0),
                 ACG    = c("T",    0,  0,  0,  4,  0,  0,  0,  0,  0),
                 ACT    = c("T",    0,  0,  0,  93, 0,  38304,  0,  38279,  0))

rownames(df) <- c("Used_aa",    "V1",   "V2",   "V3",   "V4",   "V5",   "V6",   "V7",   "V8",   "V9")

The code:

library(dplyr)  # provides %>%, mutate() and case_when()

df1 <- df %>% 
  mutate(dsn = case_when(c(2:6) > 0 & refS_aa == "Used_aa" ~ "s",
                         c(2:6) > 0 & refS_aa != "Used_aa" ~ "ns"))

gives this: [image of the resulting data frame, not available]
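One caveat about the snippet above: `c(2:6)` is the literal integer vector 2 to 6 rather than columns 2 to 6, and `"Used_aa"` is compared as a string rather than as row 1, so every row ends up labelled "ns". A single observation can also have counts under both a matching and a non-matching codon (V1 has counts under both AAT and ACC), so one label per row may not be well defined. Below is a minimal base R sketch of one possible reading, assuming the label is wanted per cell (one row per observation and codon). The names `long`, `obs`, `codon`, `used_aa` and `n` are made up for illustration and are not from the question.

```r
# Rebuild the example data from the question
df <- data.frame(
  refS_aa = c("", "N", "L", "T", "T", "R", "T", "Q", "T", "N"),
  AAT = c("N", 38404, 0, 0, 0, 0, 31, 0, 0, 38389),
  ACA = c("T", 0, 0, 38387, 9, 7, 2, 351225, 0, 0),
  ACC = c("T", 66, 0, 0, 38115, 0, 1, 0, 1, 0),
  ACG = c("T", 0, 0, 0, 4, 0, 0, 0, 0, 0),
  ACT = c("T", 0, 0, 0, 93, 0, 38304, 0, 38279, 0),
  stringsAsFactors = FALSE
)
rownames(df) <- c("Used_aa", paste0("V", 1:9))

# Amino acid given in the "Used_aa" row for each of columns 2 to 6
used_aa <- unlist(df["Used_aa", 2:6], use.names = FALSE)

# Counts without the "Used_aa" row. Each column mixes a letter with numbers,
# so c() stored everything as character; convert back to numeric.
counts <- df[-1, 2:6]
counts[] <- lapply(counts, as.numeric)

# Hypothetical long layout: one row per (observation, codon) cell
long <- data.frame(
  obs     = rep(rownames(counts), times = ncol(counts)),
  codon   = rep(names(counts), each = nrow(counts)),
  refS_aa = rep(df$refS_aa[-1], times = ncol(counts)),
  used_aa = rep(used_aa, each = nrow(counts)),
  n       = unlist(counts, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Keep cells with a count > 0 and label them:
# "s" when refS_aa equals Used_aa, "ns" otherwise
long <- long[long$n > 0, ]
long$dsn <- factor(ifelse(long$refS_aa == long$used_aa, "s", "ns"),
                   levels = c("s", "ns"))
long
```

If a single label per observation is needed instead, the rule for rows that contain both "s" and "ns" cells would have to be decided first; `table(long$obs, long$dsn)` is one way to see which rows are affected.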

Upvotes: 1
