Rstudyer

Reputation: 477

Assign numeric order to each grouped variable subset in R

I have a dataset like this:

var<-c("df1::x2","df1::x2","df1::x2","df1::x6","df1::x6","df1::x7","df1::x7","df1::x7")
mValue<-c("3months","6months","12months","yes","no", "male","female","i dont't know")
valueOrder<-"NA"
dat<-data.frame(var, mValue, valueOrder)

I want to number the rows within each group of var (x2, x6 and x7). Ideally, the result should look like this:

var             mValue         valueOrder
df1::x2         3months             1
df1::x2         6months             2
df1::x2         12months            3
df1::x6         yes                 1
df1::x6         no                  2
df1::x7         male                1
df1::x7         female              2
df1::x7         i don't know        3

I am not sure whether I need to group by "var" with group_by first and then apply some function to assign the numeric order, or whether there is a smarter and more efficient way to handle this. Could someone show me how to achieve this outcome? Thanks a lot!

Upvotes: 1

Views: 30

Answers (2)

jay.sf

Reputation: 72828

Use ave to take a running count of the duplicates of var within each var group; adding 1 turns that count into a 1-based position.

transform(dat, valueOrder=ave(var, var, FUN=\(x) cumsum(duplicated(x)) + 1))
#       var        mValue valueOrder
# 1 df1::x2       3months          1
# 2 df1::x2       6months          2
# 3 df1::x2      12months          3
# 4 df1::x6           yes          1
# 5 df1::x6            no          2
# 6 df1::x7          male          1
# 7 df1::x7        female          2
# 8 df1::x7 i dont't know          3
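A possible variation, offered as a sketch rather than as part of the original answer: as far as I can tell, ave returns a vector of the same type as its first argument, so passing the character column var gives valueOrder as character ("1", "2", ...). If an integer column is wanted, one option is to pass seq_along(dat$var) as the first argument and use seq_along as FUN. The data below is the question's data, assuming "df1::x6" was intended throughout.

```r
# Same data as in the question (assuming "df1::x6" was meant in row 5).
dat <- data.frame(
  var = c("df1::x2", "df1::x2", "df1::x2", "df1::x6", "df1::x6",
          "df1::x7", "df1::x7", "df1::x7"),
  mValue = c("3months", "6months", "12months", "yes", "no",
             "male", "female", "i dont't know")
)

# The first argument is an integer vector, so the result stays integer.
# Within each level of var, seq_along() numbers the rows 1..n.
dat$valueOrder <- ave(seq_along(dat$var), dat$var, FUN = seq_along)

# Inspect the type of the new column.
str(dat$valueOrder)
```

This relies on the rows already being in the desired order within each group; it numbers rows by position and does not sort mValue.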

Data:

dat <- structure(list(var = c("df1::x2", "df1::x2", "df1::x2", "df1::x6", 
"df1::x6", "df1::x7", "df1::x7", "df1::x7"), mValue = c("3months", 
"6months", "12months", "yes", "no", "male", "female", "i dont't know"
), valueOrder = c("NA", "NA", "NA", "NA", "NA", "NA", "NA", "NA"
)), class = "data.frame", row.names = c(NA, -8L))

Upvotes: 1

akrun

Reputation: 887118

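Here is one way with rowid (from data.table), which returns a sequence number within each group. Strip the prefix of the 'var' column up to and including the :: with trimws, then pass the result to rowid. Note that the whitespace argument of trimws requires R 3.6.0 or later.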

library(dplyr)
library(data.table)
dat <- dat %>%
    mutate(valueOrder = rowid(trimws(var, whitespace = ".*::")))

Output:

dat
       var        mValue valueOrder
1 df1::x2       3months          1
2 df1::x2       6months          2
3 df1::x2      12months          3
4 df1::x6           yes          1
5 df1::x6            no          2
6 df1::x7          male          1
7 df1::x7        female          2
8 df1::x7 i dont't know          3
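For the group_by route mentioned in the question, a minimal dplyr-only sketch might look like the following. It is an alternative to the rowid approach above, not a restatement of it, and it assumes the rows are already in the desired order within each group and that "df1::x6" was intended in row 5 of the question's data.

```r
library(dplyr)

# Same data as in the question (assuming "df1::x6" was meant in row 5).
dat <- data.frame(
  var = c("df1::x2", "df1::x2", "df1::x2", "df1::x6", "df1::x6",
          "df1::x7", "df1::x7", "df1::x7"),
  mValue = c("3months", "6months", "12months", "yes", "no",
             "male", "female", "i dont't know")
)

dat <- dat %>%
  group_by(var) %>%                      # one group per distinct var
  mutate(valueOrder = row_number()) %>%  # 1..n within each group
  ungroup()                              # drop the grouping afterwards

dat
```

Grouping on the full var string works here because the prefix is the same for every row; stripping it first would only matter if different prefixes should share one sequence.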

Upvotes: 1
