Michael

Reputation: 63

Writing a function to add rows to a data frame

I'm trying to write a function that will automatically add an empty row to the end of a data frame and assign the resulting data frame to the original name.

As an example, I've created the empty data frame data using:

data <- data.frame(id = integer(0), name = character(0))

I can add a row to data using single-bracket subsetting to assign NAs to all variables for the new row:

data[nrow(data) + 1, 1:ncol(data)] <- NA

This returns the same data frame with an additional row of NAs:

> data
  id name
1 NA <NA>   

Running it twice confirms that the snippet works:

> data <- data.frame(id = integer(0), name = character(0))
> data[nrow(data) + 1, 1:ncol(data)] <- NA
> data[nrow(data) + 1, 1:ncol(data)] <- NA
> data
  id name
1 NA <NA>
2 NA <NA>

The problem arises when I try to wrap this code in a function:

add_row <- function(df) {
  df[nrow(df) + 1, 1:ncol(df)] <- NA
}

Calling add_row() returns no errors, but does not add a new row to the data frame:

> add_row(data)
> data
[1] id   name
<0 rows> (or 0-length row.names)

Clearly I'm missing something, but I'm not sure what it could be. Any help is greatly appreciated!

Upvotes: 1

Views: 397

Answers (2)

G. Grothendieck

Reputation: 269451

A. Functional approach

Return df and then assign it to a new data frame or overwrite the existing one:

add_row <- function(df) {
  df[nrow(df) + 1, 1:ncol(df)] <- NA
  df
}

data <- add_row(data)
# or
data2 <- add_row(data)
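
The version in the question has no visible effect because the assignment inside the function changes only the function's local copy of df. The function then returns the value of its last expression, which is the assignment itself: that value is the right-hand side, NA, returned invisibly. A minimal sketch contrasting the two versions (add_row_broken is a name made up here for the question's version):

```r
data <- data.frame(id = integer(0), name = character(0))

# The question's version: only the local copy `df` is modified, and the
# function's value is the value of the assignment (NA, returned invisibly).
add_row_broken <- function(df) {
  df[nrow(df) + 1, 1:ncol(df)] <- NA
}

# Fixed version: return the modified copy explicitly.
add_row <- function(df) {
  df[nrow(df) + 1, 1:ncol(df)] <- NA
  df
}

res <- add_row_broken(data)
nrow(data)   # still 0: the caller's data frame is untouched

data <- add_row(data)
nrow(data)   # 1 after assigning the result back
```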

B. In place

1. Pass name and environment

You can overwrite the data frame from within the function, but this departs from the functional style generally used in R, which emphasizes side-effect-free processing.

add_row_name <- function(df, envir = parent.frame()) {
  dfx <- envir[[df]]
  dfx[nrow(dfx) + 1, 1:ncol(dfx)] <- NA
  envir[[df]] <- dfx
  invisible(dfx)
}

add_row_name("data")
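
As a rough illustration of what the envir argument does, the sketch below calls add_row_name once with an explicit environment and once from inside a function, where the default parent.frame() resolves to the caller's frame. The names e, scores, f, and local_df are hypothetical and exist only for this example.

```r
# Same definition as above, repeated so the example runs on its own.
add_row_name <- function(df, envir = parent.frame()) {
  dfx <- envir[[df]]
  dfx[nrow(dfx) + 1, 1:ncol(dfx)] <- NA
  envir[[df]] <- dfx
  invisible(dfx)
}

# Explicit environment: the data frame lives in `e`, not the workspace.
e <- new.env()
e$scores <- data.frame(id = 1:2, name = c("a", "b"))
add_row_name("scores", envir = e)
nrow(e$scores)   # 3

# Default environment: the caller's frame, here the body of f().
f <- function() {
  local_df <- data.frame(id = 1L, name = "a")
  add_row_name("local_df")
  nrow(local_df)
}
f()              # 2
```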

2. Pass formula

Alternatively, specify the name using a formula:

add_row_fo <- function(formula, envir = environment(formula)) {
    add_row_name(all.vars(formula), envir)
}

add_row_fo(~ data)
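
One reason the formula form can be convenient is that a formula records the environment in which it was created, so the function can find the variable without the caller passing envir. A small sketch under that assumption (g and tmp are hypothetical names used only here):

```r
# Definitions repeated so the example runs on its own.
add_row_name <- function(df, envir = parent.frame()) {
  dfx <- envir[[df]]
  dfx[nrow(dfx) + 1, 1:ncol(dfx)] <- NA
  envir[[df]] <- dfx
  invisible(dfx)
}

add_row_fo <- function(formula, envir = environment(formula)) {
  add_row_name(all.vars(formula), envir)
}

# all.vars() extracts the variable name from the formula as a string.
all.vars(~ data)   # "data"

# The formula is created inside g(), so it remembers g()'s frame
# and add_row_fo() updates the local variable `tmp`.
g <- function() {
  tmp <- data.frame(id = 1L, name = "a")
  add_row_fo(~ tmp)
  nrow(tmp)
}
g()                # 2
```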

3. Non-standard evaluation

Another possibility is to use non-standard evaluation:

add_row_ns <- function(df, envir = parent.frame()) {
  nm <- deparse(substitute(df))
  dfx <- envir[[nm]]
  dfx[nrow(dfx) + 1, 1:ncol(dfx)] <- NA
  envir[[nm]] <- dfx
  invisible(dfx)
}

add_row_ns(data)
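
Note that deparse(substitute(df)) only yields a usable variable name when the argument is a bare name. The variant below is a sketch, not part of the original answer: add_row_ns_checked is a hypothetical name, and it simply refuses anything that is not a plain symbol rather than looking up a variable literally named after the deparsed expression.

```r
add_row_ns_checked <- function(df, envir = parent.frame()) {
  expr <- substitute(df)
  # Reject anything that is not a bare variable name, e.g. data[1:2, ].
  if (!is.name(expr)) {
    stop("`df` must be a bare variable name, got: ", deparse(expr))
  }
  nm <- as.character(expr)
  dfx <- envir[[nm]]
  dfx[nrow(dfx) + 1, 1:ncol(dfx)] <- NA
  envir[[nm]] <- dfx
  invisible(dfx)
}

data <- data.frame(id = 1:2, name = c("a", "b"))
add_row_ns_checked(data)
nrow(data)   # 3

# A non-name expression raises an error; capture its message.
msg <- tryCatch(add_row_ns_checked(data[1:2, ]), error = conditionMessage)
```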

C. rbind

The code above follows the approach in the question, but note that

rbind(data, NA)

is sufficient to add an NA row, provided you assign the result back to data or to a new name, so you may not need add_row in the first place.
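
For instance, with a small non-empty data frame (people is a hypothetical example, not the data frame from the question), the single NA is recycled across all columns:

```r
people <- data.frame(id = 1:2, name = c("a", "b"))

# rbind() returns a new data frame; assign it back to keep the extra row.
people <- rbind(people, NA)
nrow(people)   # 3
```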

Update

Fixed. Added additional alternatives.

Upvotes: 3

SmitM

Reputation: 1376

You need to slightly modify your code as follows:

add_row <- function(df) {
  df[nrow(df) + 1, 1:ncol(df)] <- NA
  return(df)
}

data <- add_row(data)

Upvotes: 2
