Keep the row if the specific column is the minimum value of that row

Question

I cannot share the dataset, but I will explain it as best I can. The dataset has 50 columns, 48 of which are date-times in Y/m/d h:m:s format. The data also contains many NA values, which must not be removed.

Let's say there is a column B. I want to remove the rows if the value of B is not the earliest in that row.

How can I do this in R? For example, the original would be like this:

df <- data.frame(
  A = c(11,19,17,6,13),
  B = c(18,9,5,16,12),
  C = c(14,15,8,87,16))

   A  B  C
1 11 18 14
2 19  9 15
3 17  5  8
4  6 16 87
5 13 12 16

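but I want this, keeping only the rows where B is the smallest value in the row:

   A  B  C
2 19  9 15
3 17  5  8
5 13 12 16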

Darren Tsai · Accepted Answer

You could use apply() to find the minimum for each row.

df |> subset(B == apply(df, 1, min, na.rm = TRUE))

#    A  B  C
# 2 19  9 15
# 3 17  5  8
# 5 13 12 16
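
Since the real data holds date-times with NA rather than plain numbers, here is a minimal sketch of the same idea using pmin(), which compares the columns element-wise and keeps the POSIXct class (apply() converts the data frame to a matrix first, so date-time columns are likely to end up as character). The data frame df_time, the helper to_time() and all values are made up for illustration, and I am assuming that a row where B itself is NA should be dropped, as subset() does above.

```r
# Hypothetical helper: parse "Y-m-d h:m:s" strings into POSIXct (UTC assumed)
to_time <- function(x) as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Hypothetical example data: three date-time columns containing NA
df_time <- data.frame(
  A = to_time(c("2023-01-05 10:00:00", NA, "2023-01-03 08:30:00", "2023-01-01 00:00:00")),
  B = to_time(c("2023-01-02 09:00:00", "2023-01-04 12:00:00", NA, "2023-01-06 18:00:00")),
  C = to_time(c(NA, "2023-01-07 07:00:00", "2023-01-02 06:00:00", "2023-01-01 00:00:01"))
)

# Row-wise earliest value across all columns, ignoring NA
row_min <- do.call(pmin, c(as.list(df_time), na.rm = TRUE))

# Keep rows where B is present and equals the row minimum
keep <- !is.na(df_time$B) & df_time$B == row_min
result <- df_time[keep, ]
```

With 50 columns of which only 48 are date-times, you would pass just those columns to pmin(), for example df[time_cols], where time_cols is a hypothetical character vector of their names.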

The tidyverse equivalent is

library(tidyverse)

df %>% filter(B == pmap(across(A:C), min, na.rm = TRUE))
