Reputation: 1536
I would like to return a dataframe with the minimum value of column one based on the values of columns 2-4:
df <- data.frame(one = rnorm(1000),
two = sample(letters, 1000, replace = T),
three = sample(letters, 1000, replace = T),
four = sample(letters, 1000, replace = T))
I can do:
df_group <- df %>%
group_by(two) %>%
filter(one = min(one))
This gets me the lowest value of all the "m's" in column two, but what if column three or four had a lower "m" value in column one?
The output should look like this:
one two
1 -0.311609752 r
2 0.053166742 n
3 1.546485810 a
4 -0.430308725 d
5 -0.145428664 c
6 0.419181639 u
7 0.008881661 i
8 1.223517580 t
9 0.797273157 b
10 0.790565358 v
11 -0.560031797 e
12 -1.546234090 q
13 -1.847945540 l
14 -1.489130228 z
15 -1.203255034 g
16 0.146969892 m
17 -0.552363433 f
18 -0.006234646 w
19 0.982932856 s
20 0.751936728 o
21 0.220751258 h
22 -1.557436228 y
23 -2.034885868 k
24 -0.463354387 j
25 -0.351448850 p
26 1.331365941 x
I don't care which column has the lowest value for a given letter, I just need the lowest value and the letter column.
I'm trying to wrap my head around writing this simplistically. This might be a duplicate, but I didn't know how to word the title and couldn't find any material or previous questions on how to do it.
Upvotes: 1
Views: 251
Reputation: 121568
Another solution based in data.table
:
library(data.table)
setDT(df)
melt(df,
measure=grep("one",names(df),invert = TRUE,value=TRUE))[
,min(one),value]
Upvotes: 2
Reputation: 214957
You can do something like this:
library(dplyr); library(tidyr)
df %>% gather(cols, letts, -one) %>% # gather all letters into one column
group_by(letts) %>%
summarise(one = min(one)) # do a group by summary for each letter
# A tibble: 26 × 2
# letts one
# <chr> <dbl>
#1 a -2.092327
#2 b -2.461102
#3 c -3.055858
#4 d -2.092327
#5 e -2.461102
#6 f -2.249439
#7 g -1.941632
#8 h -2.543310
#9 i -3.055858
#10 j -1.896974
# ... with 16 more rows
Upvotes: 1