Reputation: 25
I have a data set where length and age correspond to individual items (ID #). There are 4 different items, as you can see in the data set below.
range(dataset$length)
gives me the overall range of the length for all items. But I need to compare ranges to determine which item (ID #) has the largest range in length relative to the other 3.
length age ID #
3.5 5 1
7 10 1
10 15 1
4 5 2
8 10 2
13 15 2
3 5 3
7 10 3
9 15 3
4 5 4
5 10 4
7 15 4
Upvotes: 1
Views: 446
Reputation: 263451
This gives you the differences in ranges:
lapply( with(dat, tapply(length, ID, range)), diff)
And you can wrap which.max around that list to give you the ID associated with the largest value:
which.max( lapply( with(dat, tapply(length, ID, range)), diff) )
2
2
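For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample data from the question and does the same thing with split and sapply, which returns a plain named vector instead of a list. The data frame name dat and the column name ID (rather than "ID #") are assumptions made to match the code above.

```r
# Sample data copied from the question; the column name ID is an assumption
dat <- data.frame(
  length = c(3.5, 7, 10, 4, 8, 13, 3, 7, 9, 4, 5, 7),
  age    = rep(c(5, 10, 15), times = 4),
  ID     = rep(1:4, each = 3)
)

# Spread (max - min) of length within each ID, as a named numeric vector
spread <- sapply(split(dat$length, dat$ID), function(x) diff(range(x)))
spread
#   1   2   3   4
# 6.5 9.0 6.0 3.0

# Name of the ID with the largest spread
names(spread)[which.max(spread)]
# "2"
```

Taking names() of the result returns the ID label itself rather than its position, which matters if the IDs are not simply 1, 2, 3, ...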
Upvotes: 2
Reputation: 52
An easy approach which doesn't use dplyr, though perhaps less elegant, is the which function.
range(dataset$length[which(dataset$ID == 1)])
range(dataset$length[which(dataset$ID == 2)])
range(dataset$length[which(dataset$ID == 3)])
range(dataset$length[which(dataset$ID == 4)])
You could also write a function that gives you the actual range (the difference between the max and the min) and use lapply to show you the IDs paired with their ranges.
# Returns a 2 x 1 matrix: the ID on top, its length range (max - min) below
largest_range <- function(id){
  rbind(id,
        max(dataset$length[which(dataset$ID == id)]) -
          min(dataset$length[which(dataset$ID == id)]))
}
lapply(X = unique(dataset$ID), FUN = largest_range)
Upvotes: 0
Reputation: 4226
group_by in dplyr may be helpful:
library(dplyr)
dataset %>%
  group_by(ID) %>%
  summarize(ID_range = max(length) - min(length))
The above code is equivalent to the following (it's just written with %>%):
library(dplyr)
dataset <- group_by(dataset, ID)
summarize(dataset, ID_range = max(length) - min(length))
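To go from one range per ID to the single ID with the largest range, one option is to compute max(length) - min(length) per group and then keep the top row with slice_max. This is only a sketch: it assumes dplyr 1.0.0 or later (where slice_max is available) and a column named ID, and it rebuilds the question's sample data so it can be run on its own.

```r
library(dplyr)

# Sample data copied from the question; the column name ID is an assumption
dataset <- data.frame(
  length = c(3.5, 7, 10, 4, 8, 13, 3, 7, 9, 4, 5, 7),
  age    = rep(c(5, 10, 15), times = 4),
  ID     = rep(1:4, each = 3)
)

# One row per ID with the spread (max - min) of length
ranges <- dataset %>%
  group_by(ID) %>%
  summarize(ID_range = max(length) - min(length))

# Keep only the row with the largest spread
top <- ranges %>% slice_max(ID_range, n = 1)
top
```

With the sample data, top should contain ID 2 with a range of 9. Note that slice_max keeps ties by default, so more than one row can come back if two IDs share the largest range.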
Upvotes: 0
Reputation: 19544
In base R:
mins <- tapply(df$length, df$ID, min)
maxs <- tapply(df$length, df$ID, max)
names(which.max(maxs - mins))
Upvotes: 1