Alison Bennett

Reputation: 315

Use loop to assign category based on date order in R

I have an interesting and what I think should be a simple problem. The problem is how to assign a categorical variable based upon the numerical or date order in another column.

The data is sample point data over time. The same points have been measured multiple times over the course of a number of years. I want to assign the values T1, T2, T3 etc for each sample point, with T1 the first measurement, T2 the second and so on for each point.

If the data is for example:

df <- data.frame(Point = factor(c("A", "A", "B", "B", "C", "D", "E", "E", "E")), 
                            Date = c("20140404", "20161002", "20150217", "20170101", "20130508",
                                     "20130514", "20131024", "20150412", "20170210"),
                            Data = c(10, 5, 5, 3, 2, 7, 8, 5, 6))

The data frame would look like:

   Point     Date Data
1      A 20140404   10
2      A 20161002    5
3      B 20150217    5
4      B 20170101    3
5      C 20130508    2
6      D 20130514    7
7      E 20131024    8
8      E 20150412    5
9      E 20170210    6

And the end result would be:

   Point     Date Data Time
1      A 20140404   10   T1
2      A 20161002    5   T2
3      B 20150217    5   T1
4      B 20170101    3   T2
5      C 20130508    2   T1
6      D 20130514    7   T1
7      E 20131024    8   T1
8      E 20150412    5   T2
9      E 20170210    6   T3

I'm sure this can be accomplished using a for loop, where:

for (i in df$Point) {
  # df$Time <- ... (this is the part I can't work out)
}

But I get stuck at how to get R to add T1 for the lowest df$Date, T2 for the next and so on.

Any help appreciated.

Upvotes: 1

Views: 580

Answers (1)

lizzie

Reputation: 606

You could do:

df$Time <- paste0("T", ave(df$Data, df$Point, FUN=seq_along))

Output:

print(df)

  Point     Date Data Time 
1     A 20140404   10   T1
2     A 20161002    5   T2
3     B 20150217    5   T1
4     B 20170101    3   T2
5     C 20130508    2   T1
6     D 20130514    7   T1
7     E 20131024    8   T1
8     E 20150412    5   T2
9     E 20170210    6   T3

This assumes that your Date column is already sorted within each Point (as in your example), because the numbering follows the row order rather than the dates themselves.
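If the rows are not sorted by date, one option is to rank the dates within each point instead of counting rows. The following is only a sketch: the shuffled data frame df2 and the helper vector date_num are made up for illustration, and it assumes the dates are always stored as "YYYYMMDD" strings.

```r
# Same values as in the question, but with the rows shuffled
# (df2 is a hypothetical example, not the asker's real data)
df2 <- data.frame(
  Point = factor(c("E", "A", "B", "E", "A", "C", "D", "B", "E")),
  Date  = c("20170210", "20161002", "20150217", "20131024", "20140404",
            "20130508", "20130514", "20170101", "20150412"),
  Data  = c(6, 5, 5, 8, 10, 2, 7, 3, 5)
)

# Convert the "YYYYMMDD" strings to numbers so they can be ranked
date_num <- as.numeric(as.Date(df2$Date, format = "%Y%m%d"))

# Rank the dates within each Point; ties.method = "first" keeps the
# ranks as whole numbers even if two rows share a date
df2$Time <- paste0("T", ave(date_num, df2$Point,
                            FUN = function(d) rank(d, ties.method = "first")))

# Optional: sort by Point and date to make the result easier to read
df2 <- df2[order(df2$Point, date_num), ]
print(df2)
```

Ranking rather than counting means the labels depend on the dates, so the original row order no longer matters.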

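The ave function applies FUN (in this case seq_along) to each subset of its first argument, where the subsets are defined by the level combinations of the grouping factors. Within each group, seq_along returns 1, 2, 3, ... up to the number of rows in that group, so the values of df$Data are not actually used, only how many there are per Point.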
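To see what ave does in isolation, here is a small sketch on toy vectors (x, g and counter are made-up names for illustration):

```r
# Toy data: five values in two groups
g <- factor(c("a", "b", "a", "a", "b"))
x <- c(10, 20, 30, 40, 50)

# Position of each element within its own group
counter <- ave(x, g, FUN = seq_along)
print(counter)   # 1 1 2 3 2

# The result has the same length and order as x, so it can be
# pasted straight into a new column
print(paste0("T", counter))
```

The output lines up with the input row by row, which is why it can be assigned directly to df$Time.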

For more information, see the R help pages by running:

  • ?ave
  • ?seq_along

Upvotes: 1
