Select a string with max length while using Group by in data table in r

Question

My initial sample data was ambiguous, so I have updated my data set.

library(data.table)
a <- data.table(name = c("?", "", "One", "?", "", "Two"), value = c(1, 3, 2, 6, 5, 2), job = c(1, 1, 1, 2, 2, 2))

   name value job
1:    ?     1   1
2:          3   1
3:  One     2   1
4:    ?     6   2
5:          5   2
6:  Two     2   2

I want to group by the column "job", take the maximum of column "value" within each group, and select the "name" that has the maximum length.

My expected output would be

   name job value
1: One    1     3
2: Two    2     6

I think I want the equivalent of "How do I select the longest 'string' from a table when grouping in R".

akrun · Accepted Answer

We can group by 'job', get the index of the max number of characters (nchar) in 'name' and subset the dataset.

a[, .SD[which.max(nchar(name))], by = job]
#    name value job
#1:  One     3   1
#2:  Two     6   2

Or get the row index (.I) from which.max, extract the column with the index ("V1") and subset the dataset.

a[a[, .I[which.max(nchar(name))], by = job]$V1]
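
To make the two steps of the `.I` approach visible, here is a minimal self-contained sketch using the data from the question. The name `idx` is only an illustrative helper and is not part of the original answer.

```r
library(data.table)

a <- data.table(name  = c("?", "", "One", "?", "", "Two"),
                value = c(1, 3, 2, 6, 5, 2),
                job   = c(1, 1, 1, 2, 2, 2))

# Inner step: for each job, pick the global row number (.I) of the longest
# 'name'; data.table stores the unnamed result in a column called V1
idx <- a[, .I[which.max(nchar(name))], by = job]

# Outer step: subset the original table by those row numbers
a[idx$V1]
```

Note that `which.max` returns the position of the first maximum, so if several names in a group share the greatest length, the earliest row is kept.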

Update

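With the new example, the maximum 'value' no longer sits on the row with the longest 'name' (for job 1, "One" has value 2 while the group maximum is 3), so the approaches above return that row's 'value' rather than the group maximum. In that case, compute each column separately.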

a[, .(value = max(value), name = name[which.max(nchar(name))]),
  by = job]
#    job value name
#1:    1     3  One
#2:    2     6  Two
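
For completeness, a self-contained sketch of the same idea that also puts the columns in the order shown in the question. It assumes the data from the question; `res` is an illustrative name, and the `setcolorder()` call is an optional addition, not part of the original answer.

```r
library(data.table)

a <- data.table(name  = c("?", "", "One", "?", "", "Two"),
                value = c(1, 3, 2, 6, 5, 2),
                job   = c(1, 1, 1, 2, 2, 2))

# Compute the two summaries independently within each job
res <- a[, .(name  = name[which.max(nchar(name))],  # longest string in the group
             value = max(value)),                    # group maximum of 'value'
         by = job]

# Optional: reorder columns (by reference) to match the requested layout
setcolorder(res, c("name", "job", "value"))
res
```

Because the two summaries are computed independently, the returned 'name' and 'value' may come from different rows of the same group, which is what the question asks for.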
