lmcshane
lmcshane

Reputation: 1114

R sqldf loop and create data frame named based on query name

I have data frame of SQL queries I would like to execute on another data frame.

queries <- structure(list(Name = c("innovation", "foos", "testing"), A = c("select * from data WHERE `TEXT` RLIKE '[[:<:]]innovat[^[:space:]]+[[:>:]]'", "select * from data WHERE `TEXT` RLIKE '[[:<:]]foo[^[:space:]]+[[:>:]]'", "select * from data WHERE `TEXT` RLIKE '[[:<:]]test[^[:space:]]+[[:>:]]'"), B = c("", "b", "b"), C = c("c", "c", "c")), .Names = c("Name", "Query", "Q1_2", "Q1_3"), row.names = c(NA, -3L), class = "data.frame")

I would like to loop through these queries using the package sqldf, and name the data frame output from each query its correspond name found in the data frame queries. I also need to create a new variable in each dataframe that matches the data frame name.

Sample dataset

data <- structure(list(Participant = 1:3, A = c("and other foo things", "testing test and foo", "nothing here"), B = c("", "b", "b"), C = c("c", "c", "c")), .Names = c("Participant", "TEXT", "other", "another"), row.names = c(NA, -3L), class = "data.frame")

Something like:

for (i in queries[2]) {
  i<- as.data.frame(sqldf(i)
                      i$category <- i

}

But this doesn't work yet. Suggestions?

Upvotes: 0

Views: 399

Answers (2)

Parfait
Parfait

Reputation: 107587

Consider iterating by rows, capturing each row's Query and Name to create a dataframe, all compiled into a list of dataframes. You can then output to your global environment (or better yet, keep in list):

dfList <- lapply(seq_len(nrow(queries)), function(i){
  df <- sqldf(queries$Query[i])
  df$Category <- queries$Name[i]
  return(df)
})    
names(dfList) <- queries$Name

list2env(dfList, envir = .GlobalEnv)

Alternatively, use mapply's wrapper Map, passing in each Query and Name column vector:

handler <- function(x, y) {
    df <- sqldf(x)
    df$Category <- y
    return(df)
}
dfList <- Map(handler, queries$Query, queries$Name)
names(dfList) <- queries$Name

list2env(dfList, envir = .GlobalEnv)

Upvotes: 0

seasmith
seasmith

Reputation: 919

df <- lapply(queries[[2]], sqldf) names(df) <- queries[[1]]

EDIT: Should work.

Upvotes: 1

Related Questions