Reputation: 11546
I have a dataframe that holds distances between different localities in a city. When grouping based on a condition, some localities are "shared" by others, which leads to duplication in my calculations. To correct this, I am trying to calculate the minimum distance across each row and set all other values in that row to 0, so that only the row-wise minimum values enter my calculations.
Sample data:
> df <- data.frame(name = letters[1:3],
+ col1 = rnorm(3,10,1),
+ col2 = rnorm(3,10,1),
+ col3 = rnorm(3,10,1))
> df
name col1 col2 col3
1 a 9.994703 10.882758 9.005535
2 b 11.505343 9.613655 9.589866
3 c 11.713150 9.240391 9.788279
> df$min <- apply(df[,2:4],1,min)
> df
name col1 col2 col3 min
1 a 9.994703 10.882758 9.005535 9.005535
2 b 11.505343 9.613655 9.589866 9.589866
3 c 11.713150 9.240391 9.788279 9.240391
>
Now I need to set the values that are not the row minimum to 0. Expected output:
> df
name col1 col2 col3 min
1 a 0 0 9.005535 9.005535
2 b 0 0 9.589866 9.589866
3 c 0 9.240391 0 9.240391
Could someone let me know how to go about it?
Upvotes: 2
Views: 1056
Reputation: 389315
I guess you don't really need the min column. You could turn the values to 0 in the same apply call by comparing each value with the minimum of its row.
df[, 2:4] <- t(apply(df[,2:4],1,function(x) x * +(x == min(x))))
df
# name col1 col2 col3
#1 a 9.439524 0 0.000000
#2 b 0.000000 0 8.734939
#3 c 0.000000 0 9.313147
data
set.seed(123)
df <- data.frame(name = letters[1:3],
col1 = rnorm(3,10,1),
col2 = rnorm(3,10,1),
col3 = rnorm(3,10,1))
df
# name col1 col2 col3
#1 a 9.439524 10.07051 10.460916
#2 b 9.769823 10.12929 8.734939
#3 c 11.558708 11.71506 9.313147
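As a possible alternative to the row-wise apply call, the same result can be sketched with the vectorised pmin(). This is only an illustration, not the answer's own method: the data frame below is rebuilt by hand from the values printed in the question, and the names cols, row_min and m are made up for the example.

```r
# Values copied from the sample data printed in the question.
df <- data.frame(
  name = c("a", "b", "c"),
  col1 = c(9.994703, 11.505343, 11.713150),
  col2 = c(10.882758, 9.613655, 9.240391),
  col3 = c(9.005535, 9.589866, 9.788279)
)

cols <- c("col1", "col2", "col3")

# pmin() takes the element-wise minimum of its arguments; passing the
# distance columns as arguments yields one minimum per row.
row_min <- do.call(pmin, df[cols])

# Work on a numeric matrix so that row_min (length = number of rows)
# is recycled down each column, then zero every non-minimum value.
m <- as.matrix(df[cols])
m[m != row_min] <- 0
df[cols] <- m

df
```

In the apply version, the unary + only converts the logical comparison to 0/1 before multiplying. In both versions, if two columns tie for the row minimum, both values are kept.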
Upvotes: 1
Reputation: 40171
One dplyr and purrr solution could be:
library(dplyr)
library(purrr)
df %>%
  mutate(min_col = pmap(across(starts_with("col")), min),
         across(starts_with("col"), ~ (. == min_col) * .))
name col1 col2 col3 min_col
1 a 0 0 9.659657 9.659657
2 b 0 0 10.288515 10.28851
3 c 0 0 9.303990 9.30399
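Note that pmap() returns a list, so min_col above is a list column. If a plain numeric column is preferred, a variant with pmap_dbl() might look like the sketch below. It assumes dplyr 1.0 or later and purrr are installed; the data frame is rebuilt by hand from the values in the question, and res is a name made up for the example.

```r
library(dplyr)
library(purrr)

# Values copied from the sample data printed in the question.
df <- data.frame(
  name = c("a", "b", "c"),
  col1 = c(9.994703, 11.505343, 11.713150),
  col2 = c(10.882758, 9.613655, 9.240391),
  col3 = c(9.005535, 9.589866, 9.788279)
)

res <- df %>%
  mutate(
    # pmap_dbl() calls min() once per row and returns a double vector
    min_col = pmap_dbl(across(starts_with("col")), min),
    # keep a value only where it equals its row minimum, otherwise 0;
    # starts_with("col") does not match "min_col", so that column is untouched
    across(starts_with("col"), ~ if_else(.x == min_col, .x, 0))
  )

res
```

Newer dplyr releases also provide pick() for selecting columns without applying a function, which could stand in for the first across() call.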
Upvotes: 0