Reputation: 21
I have a data frame, df, to which I want to add a new column "dsn" with two categories, "s" and "ns", based on conditions involving columns 2 to 6 and row 1.
df:
df <- data.frame(refS_aa = c("", "N", "L", "T", "T" ,"R", "T", "Q", "T", "N"),
AAT = c("N", 38404, 0, 0, 0, 0, 31, 0, 0,38389),
ACA = c("T", 0, 0, 38387, 9, 7, 2, 351225, 0, 0),
ACC = c("T", 66, 0, 0, 38115, 0, 1, 0, 1, 0),
ACG = c("T", 0, 0, 0, 4, 0, 0, 0, 0, 0),
ACT = c("T", 0, 0, 0, 93, 0, 38304, 0, 38279, 0))
rownames(df) <- c("Used_aa", "V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9")
df shows the number of observations for each combination of the amino acid in column "refS_aa" and the amino acid in row 1, "Used_aa". My real data have over 7,000 observations.
The new column "dsn" should classify the numerical values in df as "s" or "ns", based on the row "Used_aa" and columns 2 to 6.
That is:
"s" should have Column 2 to 6 >0 & refS_aa = Used_aa
"ns" should have Column 2 to 6 >0 & refS_aa ≠ Used_aa
I have searched for solutions everywhere I could and tried different approaches, including:
df$dsn[(df[2:6]) >0 ] & df[,1] == df[1,] <- "s"
df$dsn[(df[2:6]) >0 ] & df[,1] != df[1,] <- "ns"
But I have not succeeded.
I will appreciate any tricks!
Upvotes: 0
Views: 41
Reputation: 79276
Your data:
df <- data.frame(refS_aa = c("", "N", "L", "T", "T" ,"R", "T", "Q", "T", "N"),
AAT = c("N", 38404, 0, 0, 0, 0, 31, 0, 0,38389),
ACA = c("T", 0, 0, 38387, 9, 7, 2, 351225, 0, 0),
ACC = c("T", 66, 0, 0, 38115, 0, 1, 0, 1, 0),
ACG = c("T", 0, 0, 0, 4, 0, 0, 0, 0, 0),
ACT = c("T", 0, 0, 0, 93, 0, 38304, 0, 38279, 0))
rownames(df) <- c("Used_aa", "V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9")
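The code:
library(dplyr)
used_aa <- unlist(df["Used_aa", 2:6], use.names = FALSE)  # amino acid encoded by each codon column
counts  <- sapply(df[-1, 2:6], as.numeric)                # counts are stored as text, so convert them
# Assumption: each row is labelled by its most frequent codon; all-zero rows get NA
df1 <- df[-1, ] %>%
  mutate(dsn = case_when(rowSums(counts) == 0 ~ NA_character_,
                         refS_aa == used_aa[max.col(counts, ties.method = "first")] ~ "s",
                         TRUE ~ "ns"))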
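Since a single row can contain both matching and non-matching codons (V1 has counts under both AAT and ACC), it may also help to classify each cell rather than each row. Below is a minimal base R sketch of that reading of the rule. It assumes the "Used_aa" row holds the amino acid for each codon column and that cells with a count of 0 should stay unclassified (NA); the names `counts`, `same_aa`, `cell_dsn` and `long` are only illustrative.

```r
# Example data from the question. The codon columns are character vectors
# because their first element (the "Used_aa" row) is a letter.
df <- data.frame(
  refS_aa = c("", "N", "L", "T", "T", "R", "T", "Q", "T", "N"),
  AAT = c("N", 38404, 0, 0, 0, 0, 31, 0, 0, 38389),
  ACA = c("T", 0, 0, 38387, 9, 7, 2, 351225, 0, 0),
  ACC = c("T", 66, 0, 0, 38115, 0, 1, 0, 1, 0),
  ACG = c("T", 0, 0, 0, 4, 0, 0, 0, 0, 0),
  ACT = c("T", 0, 0, 0, 93, 0, 38304, 0, 38279, 0),
  row.names = c("Used_aa", paste0("V", 1:9)),
  stringsAsFactors = FALSE
)

codons  <- names(df)[2:6]
used_aa <- unlist(df["Used_aa", codons])   # named vector: codon -> amino acid

# Observation rows only, with the counts converted from text to numbers
obs    <- df[rownames(df) != "Used_aa", ]
counts <- sapply(obs[codons], as.numeric)  # 9 x 5 numeric matrix
rownames(counts) <- rownames(obs)

# same_aa[i, j] is TRUE when the reference amino acid of row i
# equals the amino acid encoded by codon column j
same_aa <- outer(obs$refS_aa, used_aa, "==")

# "s" / "ns" for cells with a positive count, NA for cells with count 0
cell_dsn <- ifelse(counts > 0, ifelse(same_aa, "s", "ns"), NA_character_)

# Long format: one line per (row, codon) pair that was actually observed
long <- data.frame(
  row   = rownames(counts)[row(counts)],
  codon = colnames(counts)[col(counts)],
  count = as.vector(counts),
  dsn   = as.vector(cell_dsn),
  stringsAsFactors = FALSE
)
long <- long[!is.na(long$dsn), ]
print(long)
```

The long table keeps one label per observed cell, so rows with mixed codons do not have to be forced into a single category. If one label per row is still needed, it can be derived from `cell_dsn` with whatever tie-breaking rule fits the analysis.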
Upvotes: 1