R: how to apply a function to some columns within groups of data?

Question

I would appreciate help applying a function to a data frame, grouped by some columns. I suppose I need a dplyr function, lapply, or do.call, but I could not get it to work.

I have the following data frame:

dfFull <- data.frame(Cen = c("Cen01", "Cen01", "Cen01", "Cen01",
                             "Cen01", "Cen01", "Cen01", "Cen01", 
                             "Cen02", "Cen02", "Cen02", "Cen02", 
                             "Cen02", "Cen02", "Cen02", "Cen02"), 
                     Model = c("Mod01", "Mod01", "Mod01", "Mod01", 
                               "Mod02", "Mod02", "Mod02", "Mod02",
                               "Mod01", "Mod01", "Mod01", "Mod01",
                               "Mod02", "Mod02", "Mod02", "Mod02"), 
                     Indiv = c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4), 
                     PF = c(1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2), 
                     Obj1 = c(0.0,0.02,0.01,0.03,0.01,
                              0.0,0.02,0.0,0.15,0.03, 
                              0.02,0.08,0.1,0.06,0.02,0.09), 
                     Obj2 = c(0.8,0.62,0.85,0.7,0.92,
                              0.26,0.85,0.93,0.03,0.84, 
                              0.94,0.84,0.05,0.63,0.83,0.92))

I need to call a function from the emoa package:

dominated_hypervolume(matrix_points, refp), which calculates the hypervolume of matrix_points relative to a predefined reference point refp.
refp is a vector (RP <- c(1.0,1.0)) that is the same for every calculation.

The problem lies with matrix_points:

matrix_points is a matrix that is transposed relative to my data frame (one point per column rather than one per row).
I need the hypervolume calculated from Obj1 and Obj2 of all Indiv rows, grouped by the Cen, Model and PF columns.

On a small subset of the data, I know dominated_hypervolume does the job, because I can build the proper matrix by hand.
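To make the orientation concrete, here is a minimal base R sketch for a single group (Cen01 / Mod01 / PF 1, values copied from the data frame above). The names grp and pts are made up for illustration. It only builds the matrix, and the emoa call is left as a comment so the snippet runs without the package installed.

```r
# One group from the data frame above: Cen01 / Mod01 / PF 1
# (grp and pts are illustrative names, not part of the original code)
grp <- data.frame(Obj1 = c(0.00, 0.02),
                  Obj2 = c(0.80, 0.62))

# Transpose so that rows are objectives and columns are individuals
pts <- t(as.matrix(grp))

RP <- c(1.0, 1.0)

dim(pts)       # 2 rows (Obj1, Obj2) by 2 columns (the two individuals)
rownames(pts)  # "Obj1" "Obj2"

# The reference point needs one entry per objective, i.e. per row of pts
nrow(pts) == length(RP)

# With emoa installed, the call for this group would be:
# emoa::dominated_hypervolume(pts, RP)
```

The remaining question is how to repeat this for every (Cen, Model, PF) combination without building each matrix by hand.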

I know it is wrong, but I was trying to do something like:

dfFull <- dfFull %>%
  group_by(Cen, Model, PF) %>%
  do.call(HV =dominated_hypervolume(data.matrix(t(dfFull[,5:6]), RP)))

What I expect at the end is shown below. The HV values are just examples, not calculated ones. It is fine to repeat the HV value on every row of the individuals used in its calculation.

Cen     Model   PF   Indiv    Obj1   Obj2    HV
Cen01   Mod01    1     1      0.0    0.8     0.77 
Cen01   Mod01    1     2      0.02   0.62    0.77
Cen01   Mod01    2     3      0.01   0.85    0.74
Cen01   Mod01    2     4      0.03   0.70    0.74
Cen01   Mod02    1     1      0.01   0.92    0.81
Cen01   Mod02    1     2      0.0    0.26    0.81
Cen01   Mod02    2     3      0.02   0.85    0.69
Cen01   Mod02    2     4      0.0    0.93    0.69
Cen02   Mod01    1     1      0.15   0.03    0.88 
Cen02   Mod01    1     2      0.03   0.84    0.88
Cen02   Mod01    2     3      0.02   0.94    0.86
Cen02   Mod01    2     4      0.08   0.84    0.86
Cen02   Mod02    1     1      0.1    0.05    0.76 
Cen02   Mod02    1     2      0.06   0.63    0.76
Cen02   Mod02    2     3      0.02   0.83    0.64
Cen02   Mod02    2     4      0.09   0.92    0.64

Thanks for your help.

AntoniosK · Accepted Answer

library(tidyverse)
library(emoa)

RP <- c(1.0,1.0)

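# Nest Indiv/Obj1/Obj2 within each (Cen, Model, PF) group,
# compute one HV per group, then unnest back to one row per individual
dfFull %>%
  nest(-Cen, -Model, -PF) %>%
  # dominated_hypervolume expects one point per column, hence t()
  mutate(HV = map_dbl(data, ~dominated_hypervolume(t(data.frame(.x$Obj1, .x$Obj2)), RP))) %>%
  unnest()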

#      Cen Model PF     HV Indiv Obj1 Obj2
# 1  Cen01 Mod01  1 0.3764     1 0.00 0.80
# 2  Cen01 Mod01  1 0.3764     2 0.02 0.62
# 3  Cen01 Mod01  2 0.2940     3 0.01 0.85
# 4  Cen01 Mod01  2 0.2940     4 0.03 0.70
# 5  Cen01 Mod02  1 0.7400     1 0.01 0.92
# 6  Cen01 Mod02  1 0.7400     2 0.00 0.26
# 7  Cen01 Mod02  2 0.1484     3 0.02 0.85
# 8  Cen01 Mod02  2 0.1484     4 0.00 0.93
# 9  Cen02 Mod01  1 0.8437     1 0.15 0.03
# 10 Cen02 Mod01  1 0.8437     2 0.03 0.84
# 11 Cen02 Mod01  2 0.1508     3 0.02 0.94
# 12 Cen02 Mod01  2 0.1508     4 0.08 0.84
# 13 Cen02 Mod02  1 0.8698     1 0.10 0.05
# 14 Cen02 Mod02  1 0.8698     2 0.06 0.63
# 15 Cen02 Mod02  2 0.1666     3 0.02 0.83
# 16 Cen02 Mod02  2 0.1666     4 0.09 0.92
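
For readers who prefer to stay closer to the group_by() attempt in the question, the following is a sketch of an alternative, not part of the accepted answer. It assumes dplyr and emoa are installed, rebuilds dfFull compactly with rep() so the snippet is self-contained, and stores the result in dfHV, a name made up here. rbind(Obj1, Obj2) builds the objectives-by-individuals matrix directly inside each group, so no explicit t() is needed.

```r
library(dplyr)
library(emoa)

# Same data as in the question, built compactly with rep()
dfFull <- data.frame(
  Cen   = rep(c("Cen01", "Cen02"), each = 8),
  Model = rep(rep(c("Mod01", "Mod02"), each = 4), times = 2),
  Indiv = rep(1:4, times = 4),
  PF    = rep(c(1, 1, 2, 2), times = 4),
  Obj1  = c(0.0, 0.02, 0.01, 0.03, 0.01, 0.0, 0.02, 0.0,
            0.15, 0.03, 0.02, 0.08, 0.1, 0.06, 0.02, 0.09),
  Obj2  = c(0.8, 0.62, 0.85, 0.7, 0.92, 0.26, 0.85, 0.93,
            0.03, 0.84, 0.94, 0.84, 0.05, 0.63, 0.83, 0.92)
)

RP <- c(1.0, 1.0)

# Inside a grouped mutate(), Obj1 and Obj2 are the vectors for the current
# group; rbind() stacks them into a 2 x n matrix (one point per column).
# The single HV value returned per group is recycled to every row of it.
dfHV <- dfFull %>%
  group_by(Cen, Model, PF) %>%
  mutate(HV = dominated_hypervolume(rbind(Obj1, Obj2), RP)) %>%
  ungroup()

dfHV
```

The original row and column order is kept, with HV appended as the last column. The first group (Cen01 / Mod01 / PF 1) should come out at 0.3764, matching the output above.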
