Reputation: 745
I have a dataset like this:
DF
V1 V2 V3
1. A AAA B
2. BBB B CC
3. C BB CCC
I would like to select the longest string in each row of DF and put it into a new column WINNER, like this:
DF
V1 V2 V3 WINNER
1. A AAA B AAA
2. BBB B CC BBB
3. C BB CCC CCC
I have tried
mutate( WINNER = select(which.max (c(nchar(V1), nchar(V2), nchar(V3)))
but it only works for numeric values. I would prefer a dplyr solution.
Upvotes: 10
Views: 1506
Reputation: 2164
It's a shame that max/min in R don't have a key argument like in Python, but one can quickly cook up something similar. I would suggest something like this:
library(tidyverse)
df <- read_table(
"
V1 V2 V3
A AAA B
BBB B CC
C BB CCC
"
)
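# Return the element of vars for which fn() gives the largest value
# (a stand-in for Python's max(..., key = fn)).
max_key <- function(vars, fn) {
  vars[which.max(fn(vars))]
}
df %>%
  rowwise() %>%
  mutate(
    winner = max_key(c_across(V1:V3), str_length)
  )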
#> # A tibble: 3 x 4
#> # Rowwise:
#> V1 V2 V3 winner
#> <chr> <chr> <chr> <chr>
#> 1 A AAA B AAA
#> 2 BBB B CC BBB
#> 3 C BB CCC CCC
Created on 2020-06-26 by the reprex package (v0.3.0)
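Since the point above was about both max and min, here is a minimal sketch of the matching min counterpart. The name min_key is my own (hypothetical, not from any package), and I use base nchar instead of str_length so the snippet runs without loading anything:

```r
# Hypothetical helper mirroring max_key(): return the element of vars
# for which fn() gives the smallest value.
min_key <- function(vars, fn) {
  vars[which.min(fn(vars))]
}

# One row from the question's data, as a plain character vector.
shortest <- min_key(c("BBB", "B", "CC"), nchar)
shortest
#> [1] "B"
```

It can be dropped into the same rowwise() pipeline in place of max_key if you need the shortest string per row.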
Upvotes: 3
Reputation:
You can use c_across(). The selection you pass to it controls which columns are used.
library(dplyr)
df %>%
  rowwise() %>%
  mutate(WINNER = c_across(starts_with("V"))[which.max(nchar(c_across(starts_with("V"))))])
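If you want all columns, select them with everything(). dplyr 1.0 also accepted a bare c_across(), which is a bit more compact, but calling it without a selection is deprecated as of dplyr 1.1.0.
df %>%
  rowwise() %>%
  mutate(WINNER = c_across(everything())[which.max(nchar(c_across(everything())))])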
Upvotes: 5
Reputation: 40141
One dplyr option could be:
df %>%
  rowwise() %>%
  mutate(WINNER = get(paste0("V", which.max(nchar(c_across(V1:V3))))))
V1 V2 V3 WINNER
<chr> <chr> <chr> <chr>
1 A AAA B AAA
2 BBB B CC BBB
3 C BB CCC CCC
Upvotes: 3
Reputation: 28695
df$winner <-
  Reduce(function(x, y) ifelse(nchar(y) > nchar(x), y, x), df)
df
# V1 V2 V3 winner
# 1: A AAA B AAA
# 2: BBB B CC BBB
# 3: C BB CCC CCC
Upvotes: 4
Reputation: 4358
df$winner <- apply(df,1, function(x) x[which.max(nchar(x))])
df
V1 V2 V3 winner
1. A AAA B AAA
2. BBB B CC BBB
3. C BB CCC CCC
Upvotes: 10
Reputation: 5232
In case of ties, the winner is the first occurrence in the row:
df$WINNER <- apply(df, 1, function(row) row[which.max(nchar(row))])
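To make the tie behaviour concrete, here is a small sketch on made-up data (the data frame ties and the helpers first_longest and last_longest are hypothetical names, not from the question). It also shows one way to prefer the last of the tied values instead, by searching the reversed row:

```r
# Hypothetical data: in row 1, "AA" and "BB" tie for the longest string.
ties <- data.frame(
  V1 = c("AA", "B"),
  V2 = c("BB", "CCC"),
  stringsAsFactors = FALSE
)

# which.max() returns the index of the first maximum,
# so the leftmost of the tied columns wins.
first_longest <- function(row) row[[which.max(nchar(row))]]

# Reversing the row first makes the rightmost tied column win instead.
last_longest <- function(row) {
  rev_row <- rev(row)
  rev_row[[which.max(nchar(rev_row))]]
}

winner_first <- apply(ties, 1, first_longest)
winner_last <- apply(ties, 1, last_longest)

winner_first
#> [1] "AA"  "CCC"
winner_last
#> [1] "BB"  "CCC"
```

Rows without ties (row 2 here) give the same result either way.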
Upvotes: 7