ZT_Geo
ZT_Geo

Reputation: 413

Group by group calculation of new dataframe column with multiple calculations

I have the following data frame.

example_df <- data.frame("Group_name" = c("Group 1", "Group 1", "Group 2", "Group 2", "Group 2"),
                         "Logical_variable" = as.logical(c(F,T,T,F,F)), 
                         "Numeric_variable" = as.numeric(c(1.5e-3, 1, 1, 4e-4, 3e-6)))

From this data frame I want to make a new column example_df$new_col that uses different calculations based on the Logical_variable and does these calculations on a group by group basis.

If the logical is false I need the numeric_value in that row divided by sum of numeric variables that have a false logical value (on a group by group basis) to be returned to the new column. So for Group 1 (first row) it would be 1.5e-3/1.5e-3 resulting in 1. For row 4 (Group 2) you would get 4e-4/(4e-4 + 3e-6) resulting in 0.9925558 and for row 5 3e-6/(3e-6 + 4e-4) resulting in 0.007444169.

For the true logical values I need 1/(1+false logical values for each group) to be returned to the new column. So for row 2 which is in Group 1 it would be 1/(1+1.5e-3) = 0.9985022 and for row 3 it would be 1/(1 + 4e-4 + 3e-6) = 0.9995972.

So the resulting column would be example_df$new_col <- c(1, 0.9985022, 0.9995972, 0.9925558, 0.007444169)

What would be the best way to get to this result? For my actual application there may be many groups and so doing these calculations group by group might be important.

Upvotes: 1

Views: 36

Answers (1)

jblood94
jblood94

Reputation: 17011

With data.table, working the logic as a mathematical expression:

library(data.table)

setDT(example_df)[, new_col := (Logical_variable + (1 - Logical_variable)*Numeric_variable)/(Logical_variable + sum(Numeric_variable[!Logical_variable])), Group_name]

#>    Group_name Logical_variable Numeric_variable     new_col
#> 1:    Group 1            FALSE          1.5e-03 1.000000000
#> 2:    Group 1             TRUE          1.0e+00 0.998502247
#> 3:    Group 2             TRUE          1.0e+00 0.999597162
#> 4:    Group 2            FALSE          4.0e-04 0.992555831
#> 5:    Group 2            FALSE          3.0e-06 0.007444169

Upvotes: 1

Related Questions