jsakaluk

Reputation: 549

Applying R function by row: data frame problems

I'm having trouble creating a function in R that would allow me to apply a separate function to each row of a data frame, and to save the output of that function back into the data frame.

A simple reproducible example (with the external package/function I want to apply by-row):

library(pwr)

n1 = c(22, 70)
n2 = c(25, 45)
df = data.frame(n1, n2)

What I would like to be able to do is this:

df$pwr = pwr.t2n.test(n1= df$n1, n2 = df$n2, d = NULL, sig.level = .05, power = .80)[3]

Here I feed the n1 and n2 columns from my data frame into the function's n1 and n2 arguments, but I get a number of unpleasant errors/warnings.

When I try to use adply to apply this function by-row, the same error/warnings occur:

df= adply(df, 1, transform, pwr = pwr.t2n.test(n1= df$n1, n2 = df$n2, d = NULL, sig.level = .05, power = .80)[3])

But if I apply pwr.t2n.test() to one row at a time, specifying the row and column location in the data frame for the n1 and n2 arguments, then I have no problems:

pwr.t2n.test(n1= df[1,1], n2 = df[1,2], d = NULL, sig.level = .05, power = .80)[3] 
= [1] 0.836982

pwr.t2n.test(n1= df[2,1], n2 = df[2,2], d = NULL, sig.level = .05, power = .80)[3]
= [1] 0.5398989

I am wondering if there is some way to use pwr.t2n.test() itself, or to piggyback on adply or something similar, in order to apply this function across a larger data frame and save the result for each row (given that row's n1 and n2 values).

Upvotes: 0

Views: 127

Answers (2)

zack

Reputation: 5415

A tidyverse version of what @LAP did with base-R:

library(purrr)

map_dfr(transpose(df), function(params){
  list(n1 = params$n1,
       n2 = params$n2,
       pw = pwr.t2n.test(n1 = params$n1, n2 = params$n2, d = NULL, sig.level = 0.05, power = 0.8)$d
  )
})

# A tibble: 2 x 3
     n1    n2    pw
  <dbl> <dbl> <dbl>
1    22    25 0.837
2    70    45 0.540

Just as a heads up: after benchmarking, it looks like @LAP's solution is much faster, so use theirs if you're worried about speed.
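If you only need the effect size added as a new column, rather than a rebuilt tibble, a shorter purrr variant may be enough. This is a minimal sketch, assuming purrr's map2_dbl() and a column name d that I picked for illustration:

```r
library(pwr)
library(purrr)

df <- data.frame(n1 = c(22, 70), n2 = c(25, 45))

# map2_dbl() walks n1 and n2 in parallel, one pair per call,
# and collects the results into a plain numeric vector.
# The column name `d` is an assumption made for this example.
df$d <- map2_dbl(df$n1, df$n2, function(a, b) {
  pwr.t2n.test(n1 = a, n2 = b, d = NULL, sig.level = 0.05, power = 0.8)$d
})

print(df)
```

Each call receives a single n1 and a single n2, which is the form that already worked when the rows were indexed by hand in the question.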

Edited to address follow up question:

n1 = c(22, 70)
n2 = c(25, 45)
char_vec = c('h', 'i')
df = data.frame(n1, n2, char_vec, stringsAsFactors = FALSE)

map_dfr(transpose(df), function(params){
  # params is one row of df as a named list
  list(n1 = params$n1,
       n2 = params$n2,
       pw = pwr.t2n.test(n1 = params$n1, n2 = params$n2, d = NULL, sig.level = 0.05, power = 0.8)$d,
       a_character_vec = params$char_vec
  )
}) 

# A tibble: 2 x 4
     n1    n2    pw a_character_vec
  <dbl> <dbl> <dbl> <chr>          
1    22    25 0.837 h              
2    70    45 0.540 i 
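A note on why a character column deserves attention if you go the base-R route instead: apply() converts the data frame to a matrix before looping, and a matrix holds a single type, so every value in a row arrives as a string. The sketch below only illustrates that coercion; the name row_classes is mine:

```r
df <- data.frame(n1 = c(22, 70), n2 = c(25, 45),
                 char_vec = c("h", "i"), stringsAsFactors = FALSE)

# apply() calls as.matrix(df) first; with a character column present,
# the whole matrix becomes character, including n1 and n2.
row_classes <- apply(df, 1, function(x) class(x))
print(unname(row_classes))

# Inside such a function, n1 and n2 would need converting back,
# e.g. as.numeric(x["n1"]), before being passed to a numeric routine.
```

With transpose() each column keeps its own type, which is why the version above needs no conversion.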

Upvotes: 2

LAP

Reputation: 6695

You can use indexing to make apply work:

test <- apply(df, 1, function(x){
  pwr.t2n.test(n1 = x[1], n2 = x[2], d = NULL, sig.level = .05, power = .80)
})

[[1]]

     t test power calculation 

             n1 = 22
             n2 = 25
              d = 0.836982
      sig.level = 0.05
          power = 0.8
    alternative = two.sided


[[2]]

     t test power calculation 

             n1 = 70
             n2 = 45
              d = 0.5398989
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

Then pull the effect size d out of each list element with sapply and store it in the data frame:

df$new <- sapply(test, function(x){
  x$d
})

  n1 n2       new
1 22 25 0.8369820
2 70 45 0.5398989
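The two steps can also be collapsed into one with mapply, which passes the columns element by element and so never turns the data frame into a matrix. This is a sketch of an alternative rather than part of the approach above; the column name d is my choice:

```r
library(pwr)

df <- data.frame(n1 = c(22, 70), n2 = c(25, 45))

# mapply() calls the function once per (n1, n2) pair and simplifies
# the scalar results into a numeric vector, ready to assign as a column.
df$d <- mapply(function(a, b) {
  pwr.t2n.test(n1 = a, n2 = b, d = NULL, sig.level = 0.05, power = 0.8)$d
}, df$n1, df$n2)

print(df)
```

Because only the two numeric columns are handed to mapply, other columns in the data frame (character ones included) do not affect the call.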

Upvotes: 2
