d8aninja

Reputation: 3643

How do I add an attribute to an R data.frame while I'm making it with a function?

Let's say I have an R data.frame:

> x <- data.frame()

I also have a SQL query I am building with sprintf():

> (query <- sprintf("select %s from %s %s", "names", "name_table", "where age > 20"))
[1] "select names from name_table where age > 20"

I intend to wrap this in a function that fills the data.frame x with the results of the query, and, as a few sprinkles on top, I want to tell my future self what query was used to generate x. I'd like to do this with a call to attr(), like so:

> attr(x, "query") <- query
> str(x)
'data.frame':   0 obs. of  0 variables
 - attr(*, "query")= chr "select names from name_table where age > 20"

Because the function is going to look something like

answer_maker <- function(col_names, table_name, conditions) {

                   query <- sprintf("select %s from %s %s", col_names, table_name, conditions)

                   data.frame(sql(query))

    ############## WHAT DO I DO HERE? 
    ############## I want to type something py-like ...self.attr()?
                   attr(self, "query") <- query
               }

Later on I will be able to do the following

> my_first_answer <- answer_maker("names", "name_table", "where age > 20")
> attr(my_first_answer, "query")
[1] "select names from name_table where age > 20"

Upvotes: 2

Answers (1)

G. Grothendieck

Reputation: 269885

Note that database functions in R typically return a data frame, so you don't have to fill an existing empty one. Below we use the sqldf package to keep the example self-contained and reproducible, but you can substitute whatever database access method you are using. (Typically you would need to create a database connection and pass it into answer_maker, but sqldf does not require one, so this example omits it.)

library(sqldf)   
name_table <- data.frame(names = letters, age = 1:26) # test data

answer_maker <- function(col_names, table_name, conditions) {
      query <- sprintf("select %s from %s %s", col_names, table_name, conditions)
      ans <- sqldf(query)
      attr(ans, "query") <- query
      ans
}

ans <- answer_maker("names", "name_table", "where age > 20")

giving:

> ans
  names
1     u
2     v
3     w
4     x
5     y
6     z

> attr(ans, "query")
[1] "select names from name_table where age > 20"
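
If you are not using sqldf, the same pattern can be kept independent of the database layer by passing the query runner in as an argument. The following is only a sketch: the run_query parameter and the fake_runner helper are hypothetical names introduced for illustration, and fake_runner ignores the SQL entirely so that the example runs without any database or extra packages.

```r
# Sketch: same attribute pattern, with the query runner passed in.
# `run_query` is a hypothetical stand-in for sqldf(), or for a closure
# around your own database connection; it must return a data frame.
answer_maker <- function(col_names, table_name, conditions, run_query) {
  query <- sprintf("select %s from %s %s", col_names, table_name, conditions)
  ans <- run_query(query)        # fetch the result as a data frame
  attr(ans, "query") <- query    # record the query on the result
  ans                            # return the annotated data frame
}

# Test data, as in the example above.
name_table <- data.frame(names = letters, age = 1:26)

# Hypothetical runner for illustration only: it ignores the SQL text
# and applies the equivalent filter directly in R.
fake_runner <- function(query) {
  name_table[name_table$age > 20, "names", drop = FALSE]
}

ans <- answer_maker("names", "name_table", "where age > 20", fake_runner)
attr(ans, "query")
```

In real use you would replace fake_runner with a function that actually executes the query string, for example one that wraps sqldf or your own connection object.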

Reference Classes

Using R's reference classes, we can define a class with data and query fields, plus methods that store the query and run it. Each method returns the object via .self, so the calls can be chained:

Query <- setRefClass("Query", fields = c("data", "query"),
   methods = list(
      setQuery = function(col_names, table_name, conditions) {
          query <<- sprintf("select %s from %s %s", col_names, table_name, conditions)
          .self
      },
      runQuery = function() {
          data <<- sqldf(query)
          .self
      }))

qq <- Query$
        new()$
        setQuery("names", "name_table", "where age > 20")$
        runQuery()

giving:

> qq$data
  names
1     u
2     v
3     w
4     x
5     y
6     z
> qq$query
[1] "select names from name_table where age > 20"

Upvotes: 4
