Reputation: 1114
I have data frame of SQL queries I would like to execute on another data frame.
queries <- structure(list(Name = c("innovation", "foos", "testing"), A = c("select * from data WHERE `TEXT` RLIKE '[[:<:]]innovat[^[:space:]]+[[:>:]]'", "select * from data WHERE `TEXT` RLIKE '[[:<:]]foo[^[:space:]]+[[:>:]]'", "select * from data WHERE `TEXT` RLIKE '[[:<:]]test[^[:space:]]+[[:>:]]'"), B = c("", "b", "b"), C = c("c", "c", "c")), .Names = c("Name", "Query", "Q1_2", "Q1_3"), row.names = c(NA, -3L), class = "data.frame")
I would like to loop through these queries using the package sqldf, and name the data frame output from each query its correspond name found in the data frame queries. I also need to create a new variable in each dataframe that matches the data frame name.
Sample dataset
data <- structure(list(Participant = 1:3, A = c("and other foo things", "testing test and foo", "nothing here"), B = c("", "b", "b"), C = c("c", "c", "c")), .Names = c("Participant", "TEXT", "other", "another"), row.names = c(NA, -3L), class = "data.frame")
Something like:
for (i in queries[2]) {
i<- as.data.frame(sqldf(i)
i$category <- i
}
But this doesn't work yet. Suggestions?
Upvotes: 0
Views: 399
Reputation: 107587
Consider iterating by rows, capturing each row's Query and Name to create a dataframe, all compiled into a list of dataframes. You can then output to your global environment (or better yet, keep in list):
dfList <- lapply(seq_len(nrow(queries)), function(i){
df <- sqldf(queries$Query[i])
df$Category <- queries$Name[i]
return(df)
})
names(dfList) <- queries$Name
list2env(dfList, envir = .GlobalEnv)
Alternatively, use mapply's wrapper Map, passing in each Query and Name column vector:
handler <- function(x, y) {
df <- sqldf(x)
df$Category <- y
return(df)
}
dfList <- Map(handler, queries$Query, queries$Name)
names(dfList) <- queries$Name
list2env(dfList, envir = .GlobalEnv)
Upvotes: 0
Reputation: 919
df <- lapply(queries[[2]], sqldf)
names(df) <- queries[[1]]
EDIT: Should work.
Upvotes: 1