PaulBeales

Reputation: 505

Create factor labels in a DF using a sequence of numbers

I have a data.frame containing numeric values. I want to add a column to it that holds factor labels taken from letters[]. The labels should be assigned according to a sequence of numbers that I supply, and that sequence can change on every run.

For example, my original DF has one column, x, containing numeric values, and my sequence of numbers is (3, 7, 9). The new FLABEL column should be filled according to that sequence: the first 3 rows are a, the next 4 rows (up to row 7) are b, and so on.

x       FLABEL
0.23     a
0.21     a
0.19     a
0.27     b
0.25     b
0.22     b
0.15     b
0.09     c
0.32     c
0.19     d
0.17     d

I'm struggling with how to do this. I assume it needs some form of for loop, given that my number sequence can vary in length every time I run it, so I could be populating only letters a and b, or many more.

Upvotes: 1

Views: 395

Answers (1)

RHertel

Reputation: 23798

Based on the comment by @scoa, I suggest the following modified approach:

series <- c(3, 7, 9)
series <- c(series, nrow(DF)) # This ensures that the sequence extends to the last row of DF
series2 <- c(series[1], diff(series)) # Run lengths: 3, 4, 2, 2
DF$FLABEL <- rep(letters[seq_along(series2)], series2)
#> DF
#      x FLABEL
#1  0.23      a
#2  0.21      a
#3  0.19      a
#4  0.27      b
#5  0.25      b
#6  0.22      b
#7  0.15      b
#8  0.09      c
#9  0.32      c
#10 0.19      d
#11 0.17      d

diff() computes the length of each run from the row positions in the input vector series. Here the positions 3, 7, 9, together with the appended last row 11, are converted into the number of repetitions of each successive letter and stored in series2: 3, 4, 2, 2.
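Since the sequence can change from run to run, the same diff()/rep() idea could be wrapped in a small function. The following is only a sketch: the name label_runs and its arguments are hypothetical, and it assumes the end positions are positive and sorted in increasing order. It also wraps the result in factor(), since the question asks for factor labels.

```r
# Hypothetical helper (not part of the original answer): turn the row
# positions at which each run ends into one label per row.
# Assumes `ends` is sorted in increasing order and contains positive integers.
label_runs <- function(n_rows, ends, labels = letters) {
  ends <- ends[ends < n_rows]        # ignore positions at or beyond the last row
  ends <- c(ends, n_rows)            # the final run always ends at the last row
  run_lengths <- diff(c(0, ends))    # e.g. 3, 7, 9, 11 -> 3, 4, 2, 2
  stopifnot(length(run_lengths) <= length(labels))  # enough labels available?
  factor(rep(labels[seq_along(run_lengths)], run_lengths))
}

labels_11 <- label_runs(11, c(3, 7, 9))
```

With a data frame, this could be used as DF$FLABEL <- label_runs(nrow(DF), c(3, 7, 9)). Prepending 0 before diff() avoids handling the first run separately, and dropping positions at or beyond the last row avoids a zero-length run when the sequence already ends at nrow(DF).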

data

text <- "x       FLABEL
         0.23     x
         0.21     x
         0.19     x
         0.27     x
         0.25     x
         0.22     x
         0.15     x
         0.09     x
         0.32     x
         0.19     x
         0.17     x"
DF <- read.table(text = text, header = TRUE)
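As a cross-check on the same sample values, the labels can also be derived without diff(), using findInterval() from base R. This is a different technique from the one in the answer, shown only as a sketch; the variable names ends and run_id are made up for illustration, and ends is assumed to be sorted in increasing order.

```r
# Same sample values as the x column in the question.
DF <- data.frame(x = c(0.23, 0.21, 0.19, 0.27, 0.25, 0.22,
                       0.15, 0.09, 0.32, 0.19, 0.17))
ends <- c(3, 7, 9)  # last row of every run except the final one

# findInterval() counts how many run starts (ends + 1) are <= each row number:
# rows 1-3 give 0, rows 4-7 give 1, rows 8-9 give 2, rows 10-11 give 3.
# Adding 1 turns that count into an index into letters.
run_id <- findInterval(seq_len(nrow(DF)), ends + 1) + 1
DF$FLABEL <- factor(letters[run_id])
```

Here the last row never needs to be appended to the sequence, because every row after the final end position falls into the last interval automatically.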

Upvotes: 1
