Karthik S

Reputation: 11546

How to just keep the minimum value in a row across multiple columns and make all other row values 0 in R

I have a data frame of distances between different localities in a city. When I group the localities by a condition, some of them are "shared" between groups, which leads to duplication in my calculations. To correct this, I want to find the minimum distance in each row and set all other values in that row to 0, so that only the row-wise minimum enters my calculations.

Sample data:

> df <- data.frame(name = letters[1:3],
+                  col1 = rnorm(3,10,1),
+                  col2 = rnorm(3,10,1),
+                  col3 = rnorm(3,10,1))
> df
  name      col1      col2     col3
1    a  9.994703 10.882758 9.005535
2    b 11.505343  9.613655 9.589866
3    c 11.713150  9.240391 9.788279
> df$min <- apply(df[,2:4],1,min)
> df
  name      col1      col2     col3      min
1    a  9.994703 10.882758 9.005535 9.005535
2    b 11.505343  9.613655 9.589866 9.589866
3    c 11.713150  9.240391 9.788279 9.240391
> 

Now I need to set every value that is not the row minimum to 0. Expected output:

> df
  name col1     col2     col3      min
1    a    0 0.000000 9.005535 9.005535
2    b    0 0.000000 9.589866 9.589866
3    c    0 9.240391 0.000000 9.240391

Could someone let me know how to go about it?

Upvotes: 2

Views: 1056

Answers (2)

Ronak Shah

Reputation: 389315

I guess you don't really need the min column. You can set the other values to 0 in the same apply call by comparing each value with the minimum of its row.

# x == min(x) is TRUE only at the row minimum; the unary + turns it into a 0/1 mask
df[, 2:4] <- t(apply(df[, 2:4], 1, function(x) x * +(x == min(x))))
df

#  name     col1 col2     col3
#1    a 9.439524    0 0.000000
#2    b 0.000000    0 8.734939
#3    c 0.000000    0 9.313147

data

set.seed(123)
df <- data.frame(name = letters[1:3],
                 col1 = rnorm(3,10,1),
                 col2 = rnorm(3,10,1),
                 col3 = rnorm(3,10,1))
df
#  name      col1     col2      col3
#1    a  9.439524 10.07051 10.460916
#2    b  9.769823 10.12929  8.734939
#3    c 11.558708 11.71506  9.313147
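
If apply gets slow on a large table, the same idea can probably be vectorized with pmin. This is only a sketch using the numbers from the question; the cols vector and the intermediate names m and row_min are my own, not something from the original post.

```r
# Sample values copied from the question so the result is reproducible
df <- data.frame(name = c("a", "b", "c"),
                 col1 = c(9.994703, 11.505343, 11.713150),
                 col2 = c(10.882758, 9.613655, 9.240391),
                 col3 = c(9.005535, 9.589866, 9.788279))

cols <- c("col1", "col2", "col3")   # assumed names of the distance columns

# pmin() takes the element-wise minimum across the columns, i.e. one value per row
row_min <- do.call(pmin, df[cols])

# Compare every cell with its row minimum and zero out the rest
m <- as.matrix(df[cols])
m[m != row_min] <- 0
df[cols] <- m
df$min <- row_min

df
#   name col1     col2     col3      min
# 1    a    0 0.000000 9.005535 9.005535
# 2    b    0 0.000000 9.589866 9.589866
# 3    c    0 9.240391 0.000000 9.240391
```

The comparison works because a matrix is compared with a vector in column-major order, so each column is checked against the per-row minimum.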

Upvotes: 1

tmfmnk

Reputation: 40171

One dplyr and purrr solution could be:

library(dplyr)
library(purrr)

df %>%
 mutate(min_col = pmap_dbl(across(starts_with("col")), min),
        across(starts_with("col"), ~ (. == min_col) * .))

  name col1 col2      col3   min_col
1    a    0    0  9.659657  9.659657
2    b    0    0 10.288515 10.288515
3    c    0    0  9.303990  9.303990
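
One caveat that applies to both answers: a comparison with the row minimum keeps every tied value, so a row with two equal minimum distances would still contribute twice. If that matters, something along these lines might help. It is a rough base R sketch on a made-up matrix (the values and the names m, first_min, keep and out are purely illustrative), using max.col on the negated values to pick the first minimum in each row.

```r
# Hypothetical distances with ties in rows 1 and 2
m <- matrix(c(5, 5, 7,
              3, 9, 3,
              8, 2, 6),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("col1", "col2", "col3")))

# The largest value of -m is the smallest value of m;
# ties.method = "first" keeps only the leftmost tied column
first_min <- max.col(-m, ties.method = "first")

# (row, column) index pairs of the single cell to keep in each row
keep <- cbind(seq_len(nrow(m)), first_min)

# Start from all zeros and copy over just the kept cells
out <- matrix(0, nrow(m), ncol(m), dimnames = dimnames(m))
out[keep] <- m[keep]

out
#      col1 col2 col3
# [1,]    5    0    0
# [2,]    3    0    0
# [3,]    0    2    0
```

Note that max.col treats values that differ only by a very small relative tolerance as ties, which is usually what you want for measured distances.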

Upvotes: 0
