Nabi Shaikh

Reputation: 851

Generate a sequence number (1,1,1,2,2,2,3,3,3) within groups of different length

I have a data frame with a column "Tag", here with four different levels. I need help creating the "Seq" column, a sequence generated from the "Tag" column:

df <- data.frame(Tag = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4),
                 Seq = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3))

Each "Tag" should be divided into three sub-groups defined by "Seq". We need to generate runs of 1, 2, and 3 whose total length equals the length of each "Tag". Thus, the length of each run of 1, 2, and 3 depends on the length of each "Tag".

Note that the length of each "Tag" differs. For example, Tag 1 has length 31, and its "Seq" has 10 times 1, 10 times 2, and 11 times 3.

Upvotes: 0

Views: 687

Answers (2)

Henrik

Reputation: 67778

with(df, ave(Tag, Tag, FUN = function(x){sort(rep(x = 1:3, length.out = length(x)))}))

Explanation: For each level of "Tag" (ave(Tag, Tag, ...)), repeat the levels of "Seq" (x = 1:3) to the length of that subset of "Tag" (length.out = length(x)), then sort the numbers.
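As a minimal sketch of how this behaves, the same ave() call can be tried on a small made-up vector. The names tag and seq_id and the group sizes 7 and 5 are assumptions for illustration, not the data from the question. Because rep(..., length.out = ) recycles 1:3 from the start, any leftover elements land in the earliest runs, so a group of 7 is split 3, 2, 2 and a group of 5 is split 2, 2, 1.

```r
# Hypothetical toy data: group 1 has 7 rows, group 2 has 5 rows
tag <- c(rep(1, 7), rep(2, 5))

# Within each group, recycle 1:3 to the group length, then sort into runs
seq_id <- ave(tag, tag, FUN = function(x) sort(rep(1:3, length.out = length(x))))

# Run lengths per group: rows are run labels, columns are groups
table(seq_id, tag)
```

If the longer run should come last instead (as in 10, 10, 11), this recycling order would need to be changed; the sketch only shows what the call above produces.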

Upvotes: 1

Onyambu

Reputation: 79188

To begin with, Tag 1 has length 31 while Tag 2 has length 32. In the code below, the first run (1) is never longer than the other two (2, 3), because I used a ceiling to compute the run lengths. There are no clear criteria for what the code should do when the length is not divisible by 3, e.g. 31: should it give lengths of 10, 10, 11, or would 9, 11, 11 also be fine? The code gives lengths of 9, 11, 11:

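 # number of rows per Tag
 ec <- table(df$Tag)
 # runs 2 and 3 get ceiling(n/3) rows each; run 1 gets the remainder
 unlist(mapply(function(x, y) rep(c(1, 2, 3), c(x, y, y)), ec - 2 * ceiling(ec / 3), ceiling(ec / 3)))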

To check the output, save the result in a variable, d=mapply(..., then run sapply(d,table). Hope this helps.
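A sketch of that check, under stated assumptions: the group sizes 31, 32, 28 and 27 are read off the question's example data and rebuilt with rep() rather than taken from df, and SIMPLIFY = FALSE is added here only so that d is always a list, even when every group happens to have the same size.

```r
# Assumed group sizes for Tags 1 to 4 (read from the question's example)
Tag <- rep(1:4, times = c(31, 32, 28, 27))
ec <- table(Tag)

# One vector of run labels per Tag, kept as a list so each can be tabulated
d <- mapply(function(x, y) rep(c(1, 2, 3), c(x, y, y)),
            ec - 2 * ceiling(ec / 3), ceiling(ec / 3),
            SIMPLIFY = FALSE)

# Rows are the run labels 1 to 3, columns are the Tags
run_lengths <- sapply(d, table)
run_lengths
```

For Tag 1 the first column shows 9, 11, 11, and each column sums to the size of its Tag. Note that for very small groups (a single row, for example) the first run length becomes negative and rep() fails, so this approach assumes groups of at least two rows.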

Upvotes: 1
