AndyDufresne

Reputation: 3

rowwise() not working within function?

I'm new to R, and I'm trying to write a function that sums two columns of a data frame row by row and returns the data frame with

  1. a new column containing the row sums
  2. that column given a name.

Here's a sample df of my data:

Ethnicity <- c('A', 'B', 'H', 'N', 'O', 'W', 'Unknown')
Texas <- c(2,41,56,1,3,89,7)
Tenn <- c(1,9,2,NA,1,32,3)
df <- data.frame(Ethnicity, Texas, Tenn)   # combine the vectors into the sample df

When I directly try the following code, the columns are summed by row as desired:

new_df <- df %>% rowwise() %>%
                 mutate(TN_TX = sum(Tenn, Texas, na.rm = TRUE))  

new_df

But when I try to use my function code, rowwise() seems not to work. My function code is:

df.sum.col <- function(df.in, col.1, col.2)  {

if(is.data.frame(df.in) != TRUE){               #warning if first arg not df
  warning('df.in is not a dataframe')}

if(is.numeric(col.1) != TRUE){                
  warning('col.1 is not a numeric vector')}     

if(is.numeric(col.2) != TRUE){
  warning('col.2 is not a numeric vector')}     #warning if col not numeric 


df.out <- rowwise(df.in) %>%
                 mutate(name = sum(col.1, col.2, na.rm = TRUE))

df.out 
}


bad_df <- df.sum.col(df, Texas, Tenn)

This results in

bad_df

.

I don't understand why the core of the function works outside it but not within. I also tried piping df.in to rowwise() like this:

df.out <- df.in %>% rowwise() %>%
                 mutate(name = sum(col.1, col.2, na.rm = TRUE))

But that doesn't resolve the problem.

As for naming the new column, I tried passing the name in as an argument, but didn't have any success. Thoughts on this?

Any help appreciated!

Upvotes: 0

Views: 1992

Answers (1)

Nick Kennedy

Reputation: 12640

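As suggested by @thelatemail, it's down to non-standard evaluation; rowwise() has nothing to do with it. Inside your function, mutate() finds no columns called col.1 or col.2 in df.in, so it falls back to the function arguments, which (if Texas and Tenn also exist as vectors in your workspace) evaluate to the whole vectors rather than to the current row's values. You need to rewrite your function to use mutate_. It can be tricky to understand, but here's one version of what you're trying to do: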

library(dplyr)
df <- tibble::tribble(
  ~Ethnicity, ~Texas, ~Tenn,
  "A", 2, 1,
  "B", 41, 9,
  "H", 56, 2,
  "N", 1, NA,
  "O", 3, 1,
  "W", 89, 32,
  "Unknown", 7, 3
)

df.sum.col <- function(df.in, col.1, col.2, name)  {

  if(is.data.frame(df.in) != TRUE){               #warning if first arg not df
    warning('df.in is not a dataframe')}

  if(is.numeric(lazyeval::lazy_eval(substitute(col.1), df.in)) != TRUE){                
    warning('col.1 is not a numeric vector')}     

  if(is.numeric(lazyeval::lazy_eval(substitute(col.2), df.in)) != TRUE){
    warning('col.2 is not a numeric vector')}     #warning if col not numeric 

  dots <- setNames(list(lazyeval::interp(~sum(x, y, na.rm = TRUE),
                                         x = substitute(col.1), y = substitute(col.2))),
                   name)

  df.out <- rowwise(df.in) %>%
    mutate_(.dots = dots)

  df.out 
}
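
Note that mutate_() and the lazyeval interface used above have since been deprecated in dplyr. As a minimal sketch only, and not part of the original answer, the same idea can be written with tidy evaluation, assuming a dplyr version that supports the `{{ }}` operator (dplyr 1.0 or later should do). The function name df_sum_col is hypothetical, and the sample data is rebuilt here so the block runs on its own:

```r
library(dplyr)

# Sample data from the question, rebuilt so this block is self-contained.
df <- data.frame(
  Ethnicity = c("A", "B", "H", "N", "O", "W", "Unknown"),
  Texas     = c(2, 41, 56, 1, 3, 89, 7),
  Tenn      = c(1, 9, 2, NA, 1, 32, 3)
)

# Hypothetical tidy-eval variant of df.sum.col (an assumption, not the
# answerer's code). {{ }} forwards the unquoted column names into mutate(),
# and `!!name :=` sets the new column's name from a string.
df_sum_col <- function(df.in, col.1, col.2, name) {
  stopifnot(is.data.frame(df.in))
  df.in %>%
    rowwise() %>%
    mutate(!!name := sum({{ col.1 }}, {{ col.2 }}, na.rm = TRUE)) %>%
    ungroup()   # drop the rowwise grouping before returning
}

new_df <- df_sum_col(df, Texas, Tenn, "TN_TX")
new_df$TN_TX
```

The columns are passed unquoted, as in the question, while the new column's name is passed as a string.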

In practice, you shouldn't need to use rowwise at all here, but can use rowSums, after selecting only the columns you need to sum.
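
A rough sketch of that rowSums approach, assuming the columns are passed as a character vector of names (the helper name df_sum_cols and its arguments are made up for illustration):

```r
# Sample data from the question, rebuilt so this block is self-contained.
df <- data.frame(
  Ethnicity = c("A", "B", "H", "N", "O", "W", "Unknown"),
  Texas     = c(2, 41, 56, 1, 3, 89, 7),
  Tenn      = c(1, 9, 2, NA, 1, 32, 3)
)

# Hypothetical helper: select the columns by name, sum across each row,
# and store the result under the requested column name.
df_sum_cols <- function(df.in, cols, name) {
  stopifnot(is.data.frame(df.in))
  df.in[[name]] <- rowSums(df.in[, cols, drop = FALSE], na.rm = TRUE)
  df.in
}

sum_df <- df_sum_cols(df, c("Tenn", "Texas"), "TN_TX")
sum_df$TN_TX
```

Because the column names are plain strings here, no non-standard evaluation is involved, and rowSums is vectorised, so it avoids the per-row overhead of rowwise().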

Upvotes: 1
