Reputation: 85
I am a beginner in R, so I am sorry if my question is too basic, but I would really appreciate some help with this.
mydata <-
structure(list(Col1 = c(17, 28, 80, 63, 20,
10), Col2 = c(18, 27, 89, 62, 24,
11), Col3 = c(25, 40, 80, 65, 23,
11), Col4 = c(27, 29, 100, 72, 34,
6)), class = "data.frame",
row.names = c("row1", "row2", "row3", "row4", "row5",
"row6"))
I would like to add a new column 'X' that holds A for rows 1-2, B for rows 3-4, C for row 5 and D for row 6.
The code I tried is:
mydata$X[mydata[c(1:2),]]<-A
mydata$X[mydata[c(3:4),]]<-B
mydata$X[mydata[c(5),]]<-C
mydata$X[mydata[c(6),]]<-D
I also tried quoting the letters (e.g. "A") when assigning them, but couldn't get it to work.
I got this error message:
invalid subscript type 'list'
So I tried unlisting my data, but that still did not work.
Can anybody help please?
Upvotes: 0
Views: 268
Reputation: 182
r2evans has answered the original question completely.
This is a new and unclear classification question: "I wanted to classify them into four different groups (A, T, C, G) according to the start of my sequences."
That too appears to be answered by r2evans: mydata$X[1:2] <- "A"
Extendable to: mydata$X <- c(rep("A",2), rep("B",2),rep("C",1),rep("D",1))
Ronak's recent answer is of course more elegant!
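For the original question, a minimal base R sketch of the index-based assignment follows. The attempt in the question fails because mydata[c(1:2), ] returns a data frame (a list), which cannot be used as a subscript into a vector, and because an unquoted A refers to an object named A rather than the string "A". The column is created up front here so the partial assignments do not depend on how R handles a column that does not exist yet; X_alt is a hypothetical second column used only to show the one-step variant.

```r
# Rebuild the example data from the question
mydata <- data.frame(
  Col1 = c(17, 28, 80, 63, 20, 10),
  Col2 = c(18, 27, 89, 62, 24, 11),
  Col3 = c(25, 40, 80, 65, 23, 11),
  Col4 = c(27, 29, 100, 72, 34, 6),
  row.names = paste0("row", 1:6)
)

# Option 1: create the column first, then fill it by row position
mydata$X <- NA_character_
mydata$X[1:2] <- "A"
mydata$X[3:4] <- "B"
mydata$X[5] <- "C"
mydata$X[6] <- "D"

# Option 2: build the whole vector in one step
# (X_alt is a hypothetical name, used here only for comparison)
mydata$X_alt <- rep(c("A", "B", "C", "D"), times = c(2, 2, 1, 1))
```

Both columns should end up identical; the one-step form is shorter, while the position-based form is easier to adjust when only a few rows change.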
Upvotes: 2
Reputation: 388982
You can use case_when from dplyr. We use grepl to detect the pattern based on the start of the sequence and assign values accordingly.
library(dplyr)
mydata %>%
  # If the value starts with "AAT", assign "A"
  mutate(X = case_when(grepl('^AAT', column) ~ 'A',
                       # If the value starts with "ABC", assign "B"
                       grepl('^ABC', column) ~ 'B',
                       # More cases
                       # More cases
                       # If none of them match, assign `NA`
                       TRUE ~ NA_character_))
Instead of grepl you can also use startsWith (base R) or str_detect (from stringr).
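As a rough sketch of the startsWith variant: the data frame seqs, its column sequence, and the example strings are all hypothetical, since the thread does not show the real sequence data. Note that startsWith takes the prefix as a plain string, so no "^" anchor is needed.

```r
library(dplyr)

# Hypothetical data: the real column name and sequences are not shown in the thread
seqs <- data.frame(
  sequence = c("AATGC", "ABCTT", "GGGTA"),
  stringsAsFactors = FALSE
)

seqs <- seqs %>%
  mutate(X = case_when(
    startsWith(sequence, "AAT") ~ "A",   # prefix match, no regex involved
    startsWith(sequence, "ABC") ~ "B",
    TRUE ~ NA_character_                 # anything else stays missing
  ))
```

Because startsWith does no regex matching, it is a reasonable choice when the prefixes are fixed strings; grepl or str_detect remain the better fit when the start of the sequence has to be described by a pattern.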
Upvotes: 2