Shin

Reputation: 107

group_by to manipulate several uniques

I create data by:

library(dplyr)
d <- tibble(ID = rep(sample(500), each = 20))

I want to add a new column that gives every block of 5 consecutive unique IDs a shared value. In this example that is easy because every ID occupies a fixed number of rows (20), so simply:

d = d %>% mutate(new_col = rep(sample(100), each = 100))

assigns one value to each block of 5 consecutive unique IDs. In my real data, however, the number of rows per ID is not fixed at 20. I left that part out because it depends on other long functions.

My question is simply: once the IDs exist, how can I take each block of 5 consecutive unique IDs and create another column that labels those blocks? I believe group_by might be helpful, but I am not sure how to use it.

Upvotes: 1

Views: 39

Answers (1)

akuiper

Reputation: 215117

You might need this:

d <- d %>% mutate(new_col = cumsum(ID - lag(ID, default = first(ID)) != 0) %/% 5)

Basically, ID - lag(ID, default = first(ID)) != 0 evaluates to TRUE whenever the ID changes from one row to the next. Taking the cumsum of that logical vector gives a run-length ID (rleid; take a look at this answer for more info) for the ID column, such as 0 0 0 1 1 1 2 2 2. Since you want every five IDs to share the same value in the new column, apply integer division by 5 (%/%).
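To make the three steps concrete, here is a minimal sketch on a tiny made-up vector. The names toy, changed and run_id are illustrative only and not part of the original answer, and the group size is set to 2 instead of 5 so the result fits on one line.

```r
library(dplyr)

# Toy data (hypothetical): four IDs in contiguous runs of different lengths.
toy <- tibble(ID = c(7, 7, 3, 3, 3, 9, 9, 4))

toy <- toy %>%
  mutate(
    # TRUE on the first row of every new ID (the very first row is FALSE)
    changed = ID - lag(ID, default = first(ID)) != 0,
    # running count of changes: a 0-based index of the current ID run
    run_id  = cumsum(changed),
    # integer division: every 2 consecutive runs share one value
    new_col = run_id %/% 2
  )

toy$run_id   # 0 0 1 1 1 2 2 3
toy$new_col  # 0 0 0 0 0 1 1 1
```

Note that the subtraction assumes a numeric ID column and that rows with the same ID sit next to each other. For a non-numeric ID, comparing with ID != lag(ID, default = first(ID)) should give the same change flags.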

table(d$new_col)

  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24 
100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 
 25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49 
100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 
 50  51  52  53  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72  73  74 
100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 
 75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99 
100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 

This should also work if the IDs span different numbers of rows.
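One way to check the variable-length case is sketched below. The data are invented for illustration (d2, ids, reps and check are hypothetical names, and the seed is arbitrary); the only assumption carried over from the question is that rows belonging to the same ID are contiguous.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the example is reproducible

# Hypothetical data: 12 distinct IDs, each repeated between 1 and 4 times.
ids  <- sample(100, 12)
reps <- sample(1:4, 12, replace = TRUE)
d2   <- tibble(ID = rep(ids, times = reps))

d2 <- d2 %>%
  mutate(new_col = cumsum(ID - lag(ID, default = first(ID)) != 0) %/% 5)

# Count distinct IDs and rows per group: row counts vary,
# but each group holds 5 IDs (the last one holds the remaining 2).
check <- d2 %>%
  group_by(new_col) %>%
  summarise(n_ids = n_distinct(ID), n_rows = n())

check
```

Here group_by is only used to verify the result; the grouping column itself comes from the cumsum and integer-division step.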

Upvotes: 3
