student_R123

Reputation: 1002

Create a group variable in a data frame based on the first character of a string variable in R

I have the following data frame.

sub1=c("2021","2121","M123","M143")
x1=c(10,5,6,7)
x2=c(11,12,34,56)
data=data.frame(sub1,x1,x2)

I need to create a group variable for this data frame such that if sub1 starts with the number 2, the row belongs to one group, and if sub1 starts with the letter M, it belongs to a second group.

My desired output looks like this:

  sub1 x1 x2 group
1 2021 10 11     1
2 2121  5 12     1
3 M123  6 34     2
4 M143  7 56     2

Can anyone suggest a function I can use for this? I tried the grep function as follows, but I didn't get the desired result.

data$sub1[grep("^[2].*", data$sub1)]

Thank you

Upvotes: 1

Views: 45

Answers (3)

tmfmnk

Reputation: 39858

You can also do:

as.integer(!grepl("^2", data$sub1)) + 1

[1] 1 1 2 2
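To show why the one-liner above works, here is a step-by-step sketch on the question's data. The intermediate name `starts_with_2` is only for illustration and is not part of the original answer.

```r
# Sample data from the question
data <- data.frame(
  sub1 = c("2021", "2121", "M123", "M143"),
  x1 = c(10, 5, 6, 7),
  x2 = c(11, 12, 34, 56)
)

# grepl() returns a logical vector: TRUE where sub1 starts with "2"
starts_with_2 <- grepl("^2", data$sub1)

# Negate it, convert FALSE/TRUE to 0/1, then add 1 to get groups 1 and 2
data$group <- as.integer(!starts_with_2) + 1

data$group
```

Unlike `grep()`, which returns the positions of the matches, `grepl()` returns one TRUE/FALSE per row, which is what makes it convenient for building a new column.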

Upvotes: 1

Ronak Shah

Reputation: 388817

Another way is to compare the first character (extracted with substr) to "M" and add 1 to the logical result to assign groups.

data$group <- (substr(data$sub1, 1, 1) == "M") + 1

data
#  sub1 x1 x2 group
#1 2021 10 11     1
#2 2121  5 12     1
#3 M123  6 34     2
#4 M143  7 56     2

Or extract the first character using a regex:

sub("(.).*", "\\1", data$sub1)
#[1] "2" "2" "M" "M"

and then use the same method to assign groups:

(sub("(.).*", "\\1", data$sub1) == "M") + 1
#[1] 1 1 2 2
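As a variation on the same idea (not part of the answer above), base R's `startsWith()` can test the prefix directly without a regex or `substr`. This is a minimal sketch that assumes R 3.3.0 or later; `as.character()` is there in case sub1 was read in as a factor.

```r
# Sample data from the question
data <- data.frame(
  sub1 = c("2021", "2121", "M123", "M143"),
  x1 = c(10, 5, 6, 7),
  x2 = c(11, 12, 34, 56)
)

# startsWith() needs a character vector and returns TRUE/FALSE per element;
# adding 1 turns FALSE/TRUE into groups 1/2
data$group <- startsWith(as.character(data$sub1), "M") + 1

data$group
```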

Upvotes: 1

s__

Reputation: 9485

What about this:

data$group <- ifelse(substr(data$sub1, 1, 1) == "2", 1, 2)

data
  sub1 x1 x2 group
1 2021 10 11     1
2 2121  5 12     1
3 M123  6 34     2
4 M143  7 56     2

If sub1 could start with something other than 2 or M:

ifelse(substr(data$sub1, 1, 1) == "2", 1, ifelse(substr(data$sub1, 1, 1) == "M", 2, "Missing"))
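Note that mixing the numbers 1 and 2 with the string 'Missing' makes the nested ifelse return a character vector. If a numeric group is preferred, one possible alternative is `match()` against a lookup vector, which gives NA for any unlisted prefix. This is only a sketch: the `prefixes` vector and the extra "X999" row are made up for illustration.

```r
# Hypothetical data with one value that starts with neither "2" nor "M"
data <- data.frame(
  sub1 = c("2021", "M123", "X999"),
  x1 = c(10, 6, 8)
)

# Hypothetical lookup: position in this vector becomes the group number
prefixes <- c("2", "M")

# First character of each value (as.character() guards against factors)
first_char <- substr(as.character(data$sub1), 1, 1)

# match() returns the position of each first character in prefixes,
# or NA when the character is not listed
data$group <- match(first_char, prefixes)

data$group
```

Adding a third group later only requires appending its prefix to `prefixes`, rather than nesting another ifelse.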

Upvotes: 2
