Reputation: 315
I have an interesting problem that I think should be simple: how to assign a categorical variable based on the numerical or date order in another column.
The data is sample point data over time. The same points have been measured multiple times over a number of years. I want to assign the values T1, T2, T3, etc. within each sample point, with T1 for the first measurement, T2 for the second, and so on.
If the data is for example:
df <- data.frame(Point = factor(c("A", "A", "B", "B", "C", "D", "E", "E", "E")),
                 Date = c("20140404", "20161002", "20150217", "20170101", "20130508",
                          "20130514", "20131024", "20150412", "20170210"),
                 Data = c(10, 5, 5, 3, 2, 7, 8, 5, 6))
The data frame would look like:
Point Date Data
1 A 20140404 10
2 A 20161002 5
3 B 20150217 5
4 B 20170101 3
5 C 20130508 2
6 D 20130514 7
7 E 20131024 8
8 E 20150412 5
9 E 20170210 6
And the end result would be:
Point Date Data Time
1 A 20140404 10 T1
2 A 20161002 5 T2
3 B 20150217 5 T1
4 B 20170101 3 T2
5 C 20130508 2 T1
6 D 20130514 7 T1
7 E 20131024 8 T1
8 E 20150412 5 T2
9 E 20170210 6 T3
I'm sure this can be accomplished using a for loop, where:
for (i in df$Point) {
df$Time <-
}
But I get stuck at how to get R to add T1 for the lowest df$Date, T2 for the next and so on.
Any help appreciated.
Upvotes: 1
Views: 580
Reputation: 606
You could do:
df$Time <- paste0("T", ave(df$Data, df$Point, FUN=seq_along))
Output:
print(df)
Point Date Data Time
1 A 20140404 10 T1
2 A 20161002 5 T2
3 B 20150217 5 T1
4 B 20170101 3 T2
5 C 20130508 2 T1
6 D 20130514 7 T1
7 E 20131024 8 T1
8 E 20150412 5 T2
9 E 20170210 6 T3
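This assumes that your Date column is already sorted within each Point (as it is in your example).
The ave function applies FUN (in this case seq_along) to the subsets of its first argument defined by the level combinations of the grouping factors, and returns the results in the original row order.
The seq_along function generates the sequence 1, 2, ..., up to the length of its argument.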
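To make the description above concrete, here is a small sketch on made-up vectors (grp, val and idx_within are illustrative names, not part of your data). It shows seq_along counting over a whole vector, and then the same function applied per group through ave:

```r
# Made-up group labels and values; seq_along ignores the values themselves
grp <- c("A", "A", "B", "C", "C", "C")
val <- c(10, 5, 5, 2, 7, 8)

# On a whole vector, seq_along just counts the elements: 1 2 3 4 5 6
whole <- seq_along(val)

# Through ave, the count restarts within each group: 1 2 1 1 2 3
idx_within <- ave(val, grp, FUN = seq_along)

print(whole)
print(idx_within)
```

The first argument to ave only determines the length and type of the result here, which is why passing df$Data works even though the Data values play no role in the numbering.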
For more information, see the R help pages by running:
?ave
?seq_along
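If the rows are not already in date order, one possible variation is to rank the dates within each point instead of counting rows. This is only a sketch under some assumptions: the Date strings are always in YYYYMMDD form, and the rows of the example data below have been shuffled by me purely for illustration.

```r
# Same example data, but with the rows for A and E deliberately out of date order
df <- data.frame(Point = factor(c("A", "A", "B", "B", "C", "D", "E", "E", "E")),
                 Date = c("20161002", "20140404", "20150217", "20170101", "20130508",
                          "20130514", "20170210", "20131024", "20150412"),
                 Data = c(5, 10, 5, 3, 2, 7, 6, 8, 5))

# Convert the YYYYMMDD strings to Date, then to numbers that ave can work with
d <- as.numeric(as.Date(df$Date, format = "%Y%m%d"))

# Rank the dates within each Point; ties.method = "first" keeps the labels unique
df$Time <- paste0("T", ave(d, df$Point,
                           FUN = function(x) rank(x, ties.method = "first")))

print(df)
```

With this version the earliest date for each point gets T1 regardless of row order, so the second A row (20140404) is labelled T1 and the first E row (20170210) is labelled T3.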
Upvotes: 1