Reputation: 851
I have a data frame with a column "Tag", here with four different levels. I need help creating the "Seq" column, a sequence generated from the "Tag" column:
df <- data.frame(Tag = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4),
Seq = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3 )
)
Each "Tag" should be divided into 3 sub-groups defined by "Seq". We need to generate runs of 1, 2, and 3 whose total length equals the length of each "Tag". Thus, the length of each run of 1, 2, and 3 depends on the length of the corresponding "Tag".
Note that the length of each "Tag" differs. For example, Tag 1 has length 31, and its "Seq" is 10 times 1, 10 times 2, and 11 times 3.
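One possible reading of this rule, shown only as a sketch: give the first two runs the integer part of the tag length divided by 3 and let the last run absorb the remainder. That reproduces the 10, 10, 11 split described for a tag of length 31. The helper name split_runs is hypothetical, and other ways of placing the remainder are equally plausible.

```r
# Hypothetical helper: run lengths for one tag of length n,
# assuming the last run takes whatever is left over
split_runs <- function(n) {
  base <- n %/% 3
  c(base, base, n - 2 * base)
}

lens <- split_runs(31)
print(lens)

# Build the sequence for a single tag from those run lengths
seq_one_tag <- rep(1:3, times = lens)
```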
Upvotes: 0
Views: 687
Reputation: 67778
with(df, ave(Tag, Tag, FUN = function(x) sort(rep(x = 1:3, length.out = length(x)))))
Explanation: For each level of "Tag" (ave(Tag, Tag, ...)): repeat (rep) each level of "Seq" (x = 1:3) to the length of the subset of "Tag" (length.out = length(x)), then sort the numbers.
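A minimal sketch of this approach on toy data, to make the resulting run lengths visible. The tag lengths 7 and 5 are made up for illustration. Note that cycling 1:3 with length.out hands any leftover elements to the lower values first, so the longer runs come at the start rather than at the end.

```r
# Toy data (hypothetical): two tags of lengths 7 and 5
Tag <- rep(c(1, 2), times = c(7, 5))

# Within each tag, cycle 1:3 up to the group length, then sort into runs
Seq <- ave(Tag, Tag, FUN = function(x) sort(rep(1:3, length.out = length(x))))

# Cross-tabulate to inspect the run lengths per tag
run_lengths <- table(Tag, Seq)
print(run_lengths)
```

With a length of 7 the runs come out as 3, 2, 2; by the same logic a length of 31 gives 11, 10, 10.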
Upvotes: 1
Reputation: 79188
To begin with, Tag 1 has length 31 while Tag 2 has length 32. In the code below, the first run (1) is never longer than the next two (2, 3), because I used a ceiling to compute the run lengths. There are no clear criteria for what the code should do when the length is not divisible by 3, e.g. 31: should it give lengths of 10, 10, 11, or would 9, 11, 11 also be fine? For a length of 31, the code gives 9, 11, 11:
ec <- table(df$Tag)
unlist(mapply(function(x, y) rep(c(1, 2, 3), c(x, y, y)), ec - 2 * ceiling(ec / 3), ceiling(ec / 3)))
To check the output, save the result in a variable, d = mapply(...), then run sapply(d, table).
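The check described above can be sketched as follows on toy data. The tag lengths 31 and 32 mirror the ones mentioned in this answer; the names big, small, and run_lengths are introduced here only for illustration, and SIMPLIFY = FALSE is added so that d is always a list.

```r
# Toy data (hypothetical): two tags of lengths 31 and 32
Tag <- rep(c(1, 2), times = c(31, 32))

ec <- table(Tag)          # group size per tag
big <- ceiling(ec / 3)    # length of the runs of 2 and 3
small <- ec - 2 * big     # remaining length for the run of 1

# Keep the per-tag results in a list so they can be inspected
d <- mapply(function(x, y) rep(c(1, 2, 3), c(x, y, y)), small, big,
            SIMPLIFY = FALSE)

# One column per tag, one row per value of the sequence
run_lengths <- sapply(d, table)
print(run_lengths)

# Flatten to a single vector aligned with Tag
Seq <- unlist(d, use.names = FALSE)
```

For these lengths the runs come out as 9, 11, 11 and 10, 11, 11. This scheme assumes each tag is long enough for the first run length to be non-negative.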
Hope this will be of help.
Upvotes: 1