Reputation: 770
My initial sample data was ambiguous, so I have updated my data set:
library(data.table)
a <- data.table(name = c("?", "", "One", "?", "", "Two"), value = c(1, 3, 2, 6, 5, 2), job = c(1, 1, 1, 2, 2, 2))
   name value job
1:    ?     1   1
2:          3   1
3:  One     2   1
4:    ?     6   2
5:          5   2
6:  Two     2   2
I want to group by the column "job", take the maximum of the column "value" within each group, and select the "name" with the greatest length in that group.
My expected output would be:
   name job value
1:  One   1     3
2:  Two   2     6
I think I want the equivalent of "How do I select the longest 'string' from a table when grouping" in R.
Upvotes: 2
Views: 2296
Reputation: 7443
I'm not sure you want a dplyr solution but here is one:
library(dplyr)
a %>% group_by(job) %>% slice(which.max(nchar(as.character(name))))
    name value   job
  (fctr) (dbl) (dbl)
1    One     3     1
2    Two     6     2
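With the updated data in the question, the largest 'value' and the longest 'name' sit in different rows of each group, so slicing a single row will not return both. A minimal sketch of one way to handle that in dplyr, assuming the data is held in a plain data frame with character names (the object name `res` is only illustrative):

```r
library(dplyr)

# Updated sample data from the question, as a plain data frame
a <- data.frame(
  name  = c("?", "", "One", "?", "", "Two"),
  value = c(1, 3, 2, 6, 5, 2),
  job   = c(1, 1, 1, 2, 2, 2),
  stringsAsFactors = FALSE
)

res <- a %>%
  group_by(job) %>%
  summarise(
    name  = name[which.max(nchar(name))],  # first name with the most characters
    value = max(value)                     # largest value in the group
  )
res
```

Note that which.max returns the first maximum, so if two names in a group tie on length, the earlier one is kept.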
Upvotes: 2
Reputation: 887213
We can group by 'job', get the index of the maximum number of characters (nchar) in 'name', and subset the dataset.
a[, .SD[which.max(nchar(name)) ], by = job]
# name value job
#1: One 3 1
#2: Two 6 2
Or get the row index (.I) from which.max, extract the column holding the index ("V1"), and subset the dataset.
a[a[, .I[which.max(nchar(name))], by = job]$V1]
Based on the new example, if the maximum 'value' is not in the same row as the 'name' with the most characters, we need to compute the two separately.
a[, .(value = max(value), name = name[which.max(nchar(name))]), by = job]
#    job value name
# 1:   1     3  One
# 2:   2     6  Two
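To see why the two need to be computed separately on the updated data, here is a small self-contained comparison. It is only a sketch, and the object names `by_row` and `separate` are made up for illustration:

```r
library(data.table)

# Updated sample data from the question
a <- data.table(name  = c("?", "", "One", "?", "", "Two"),
                value = c(1, 3, 2, 6, 5, 2),
                job   = c(1, 1, 1, 2, 2, 2))

# Row-wise subset: 'value' comes from the row that holds the longest name
by_row <- a[, .SD[which.max(nchar(name))], by = job]

# Separate aggregates: longest name and maximum value computed independently
separate <- a[, .(value = max(value), name = name[which.max(nchar(name))]), by = job]

by_row$value    # 2 2  (values from the "One" and "Two" rows)
separate$value  # 3 6  (per-group maxima)
```

Both approaches pick the same names here. Only the 'value' column differs, because the row-wise subset carries along whatever value is stored next to the longest name.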
Upvotes: 3