Reputation: 61
I am looking for an R solution to this problem. My list of parameters is over 18000 values long, so I attempted to split it into chunks with a for-loop, running the query in each iteration with 2000 parameters (except the last iteration, which may have fewer than 2000). However, it seems to be "storing" parameters somewhere during each iteration, so after the first iteration it tells me I hit the limit. If I break it up into chunks of 1000, it breaks down after the second iteration. My code looks like:
Start_List <- (some list of values)
# List to collect the result of each chunk's query
query_list <- list()
for (i in 1:ceiling(length(Start_List)/2000)) {
  # Partition Start_List into chunks of length 2000
  List <- Start_List[(2000*(i-1)+1):min(2000*i, length(Start_List))]
  # Create qmarks for List
  qmarks_List <- paste(rep("?", length(List)), collapse = ",")
  # Query
  query <- paste("
    SELECT columns
    FROM table
    WHERE column IN (", qmarks_List, ")
  ")
  loop_df <- dbGetQuery(db, query, params = as.list(List))
  # Store the query result in a list
  query_list[[i]] <- loop_df
}
How can I clear the parameters so it starts back at 0 parameters each iteration?
Update 8/24/2022: still looking for a solution to this problem.
Upvotes: 0
Views: 252
Reputation: 160437
While I cannot reproduce this bug, I can present an alternative: upload your values to a temporary table, query against it (either set-membership or left-join), then delete it. The reason to use bound parameters in the first place is to prevent SQL injection (malicious or accidental corruption), and I believe this suggestion preserves that intent.
DBI::dbWriteTable(con, "#sometable", data.frame(val = Start_List), create = TRUE)
## choose one from:
DBI::dbGetQuery(con, "select Columns from table where column in (select val from #sometable)")
DBI::dbGetQuery(con, "select t2.Columns from #sometable t1 left join table t2 on t1.val=t2.column")
## cleanup, though the temp table will auto-delete when you disconnect
DBI::dbExecute(con, "drop table #sometable")
While this should fix the problem you're having, it should also simplify and speed up your process: instead of iterating over groups of your 18K-long list, it does a single query and a single data-pull to retrieve all of the records. If you still need them grouped afterwards, that can be done easily in R (perhaps more easily than in SQL, but I'm confident a SQL guru could demonstrate a safe/fast/efficient SQL method for this as well); see the sketch below.
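As one minimal sketch of that regrouping step, assuming the result data frame (here called res) also returns the matched column (here called column) so each row can be mapped back to its original chunk of 2000 values:
res <- DBI::dbGetQuery(con, "select Columns from table where column in (select val from #sometable)")
## which 2000-value chunk of Start_List does each returned row belong to?
chunk_id <- ceiling(match(res$column, Start_List) / 2000)
## split the single pull back into a list of per-chunk data frames
query_list <- split(res, chunk_id)
The names res, column, and chunk_id are just placeholders here; the only requirement is that the queried column is present in the result so match() can locate it in Start_List.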
If you aren't aware of temp tables in SQL Server: prepending the table name with # makes it a per-connection temporary table, meaning that no other connected user (even the same user on a different connection) will see this table; prepending with ## makes it a "global" temporary table, meaning that the same user on any connection will see the table. Both types are automatically dropped when the creating connection closes.
Upvotes: 1