Reputation: 505
I have a data.frame containing numerics. I want to create a new column within that data.frame that will hold factor labels taken from letters[]. I want these factor labels to be built from a sequence of numbers that I have, which can change every time.
For example, my original DF has one column, x, containing numerics, and I have a sequence of numbers (3, 7, 9). I need the new FLABEL column to be populated according to that sequence, i.e. the first 3 rows are a, the next 4 rows are b, and so on.
x FLABEL
0.23 a
0.21 a
0.19 a
0.27 b
0.25 b
0.22 b
0.15 b
0.09 c
0.32 c
0.19 d
0.17 d
I'm struggling with how to do this. I'm assuming some form of for-loop is needed, given that my number sequence can vary in length every time I run it, so I could be populating just letters a and b, or many more.
Upvotes: 1
Views: 395
Reputation: 23798
Based on the comment by @scoa, I suggest the following modified approach:
series <- c(3, 7, 9)
series <- c(series, nrow(DF)) # This ensures that the sequence extends to the last row of DF
series2 <- c(series[1], diff(series)) # Run lengths: first index, then gaps between indices
DF$FLABEL <- rep(letters[seq_along(series2)], series2)
#> DF
# x FLABEL
#1 0.23 a
#2 0.21 a
#3 0.19 a
#4 0.27 b
#5 0.25 b
#6 0.22 b
#7 0.15 b
#8 0.09 c
#9 0.32 c
#10 0.19 d
#11 0.17 d
By using diff(), the length of each run is calculated from the index numbers in the input vector series. In this case, the index values 3, 7, 9 (with nrow(DF) = 11 appended) are converted into the number of repetitions of each successive letter, up to the last row of the data frame, and stored in series2: 3, 4, 2, 2.
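Since the sequence can change from run to run, the same idea can be wrapped in a small function. This is only a sketch: label_runs is a hypothetical helper name, not something from base R, and it assumes the end indices are positive and strictly increasing. It also returns a factor, as the question asks for factor labels, and stops if more than 26 groups are requested because letters has only 26 entries.

```r
# Hypothetical helper (not from the original answer): label rows by run,
# where `ends` holds the last row index of each run.
# Assumes `ends` is positive and strictly increasing.
label_runs <- function(n, ends) {
  ends <- c(ends[ends < n], n)      # extend the final run to row n
  run_lengths <- diff(c(0, ends))   # e.g. 3, 7, 9, 11 -> 3, 4, 2, 2
  stopifnot(length(run_lengths) <= length(letters))  # only 26 letters exist
  factor(rep(letters[seq_along(run_lengths)], run_lengths))
}

DF <- data.frame(x = c(0.23, 0.21, 0.19, 0.27, 0.25, 0.22,
                       0.15, 0.09, 0.32, 0.19, 0.17))
DF$FLABEL <- label_runs(nrow(DF), c(3, 7, 9))
table(DF$FLABEL)  # counts per label
```

No explicit for-loop is needed, because rep() accepts a vector of repetition counts.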
data
text <- "x FLABEL
0.23 x
0.21 x
0.19 x
0.27 x
0.25 x
0.22 x
0.15 x
0.09 x
0.32 x
0.19 x
0.17 x"
DF <- read.table(text = text, header = TRUE)
Upvotes: 1