Reputation: 569
I am calling readLines(text url) several hundred times in a script, each time with a unique text url.
After about 125 calls to readLines(text url) I got the error "all connections are in use."
When I check my open connections with showConnections(all=TRUE), for the url connections I see:
description class ... isopen
"www.site.com" "url" ... "closed" ...
How do I remove these closed connections from the R environment so I can open new connections?
Also, I've tried opening the urls beforehand, passing the url connection into readLines, and closing the connection once I'm done with it, but I still run into the same problem.
Upvotes: 10
Views: 14252
Reputation: 4576
Hadley's answer did not work for me, for two reasons: 1) close is an S3 generic and has no method for character objects (maybe this has changed since the answer was written?); 2) my connection was not in the table returned by showConnections. It was a curl type connection left open by readr::read_tsv after encountering an SSL expired certificate error. When we get warnings about connections created by third-party packages, we have to obtain the connection object in order to close it. I wrote two little functions for this purpose. They import more packages than strictly needed because I designed them for a package, but you can easily remove these dependencies.
library(magrittr)
library(tibble)
library(dplyr)
library(purrr)
#' Retrieve the open connection(s) pointing to URI
#'
#' @param uri Character: path or URL the connection points to.
#'
#' @return A list of connection objects.
#'
#' @importFrom magrittr %>%
#' @importFrom tibble rownames_to_column
#' @importFrom dplyr filter pull
#' @importFrom purrr map
#' @noRd
get_connections <- function(uri){
    showConnections(all = TRUE) %>%
    as.data.frame %>%
    rownames_to_column('con_id') %>%
    filter(description == uri) %>%
    pull(con_id) %>%
    as.integer %>%
    map(getConnection)
}
#' Closes the open connection(s) pointing to URI
#'
#' @param uri Character: path or URL the connection points to.
#'
#' @return Invisible `NULL`.
#'
#' @importFrom magrittr %>%
#' @importFrom purrr walk
#' @noRd
close_connection <- function(uri){
    uri %>%
    get_connections %>%
    walk(close)
    invisible(NULL)
}
Having the functions above, you can do as Hadley has shown:
my_function <- function(my_url, ...) {
    on.exit(close_connection(my_url))
}
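Since the package dependencies are easy to remove, here is a minimal base-R sketch of the same two helpers. The names get_connections_base and close_connection_base are hypothetical, chosen only for this illustration. The sketch assumes, as the functions above do, that the row names of the matrix returned by showConnections(all = TRUE) are the connection numbers.

```r
# Hypothetical base-R variants of get_connections() / close_connection().

# Return a list of connection objects whose description equals `uri`.
get_connections_base <- function(uri) {
    conns <- showConnections(all = TRUE)
    # Row names of the matrix are the connection numbers.
    ids <- as.integer(rownames(conns)[conns[, "description"] == uri])
    lapply(ids, getConnection)
}

# Close every connection whose description equals `uri`.
close_connection_base <- function(uri) {
    invisible(lapply(get_connections_base(uri), close))
    invisible(NULL)
}
```

These can be used in on.exit() in the same way as close_connection above.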
Upvotes: 2
Reputation: 103898
The easiest way to avoid problems like this is to explicitly close the connection when you're done with it. In R, the easiest way to do that is to use on.exit(), which ensures the connection gets closed even if an error occurs in your code:
read_url <- function(url, ...) {
    # close() needs a connection object, not a character string
    con <- base::url(url)
    on.exit(close(con))
    readLines(con, ...)
}
showConnections()
g <- read_url("http://www.google.com")
showConnections()
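A rough way to check that on.exit() really releases the connection, even after an error, is to compare the number of rows in showConnections(all = TRUE) before and after many calls. The sketch below is only an illustration: it reads a local temporary file instead of a URL so that it runs offline, and read_con is a hypothetical helper that takes an already created connection object.

```r
# Hypothetical helper: read from a connection and always release it.
read_con <- function(con, ...) {
    force(con)            # create the connection before registering the handler
    on.exit(close(con))   # runs on normal return and on error
    readLines(con, ...)
}

tmp <- tempfile()
writeLines(c("a", "b"), tmp)

before <- nrow(showConnections(all = TRUE))

# Many successful reads: no connections should accumulate.
for (i in 1:200) read_con(file(tmp))

# A failing read: this path is never created, so readLines() errors.
missing_path <- tempfile()
try(suppressWarnings(read_con(file(missing_path))), silent = TRUE)

after <- nrow(showConnections(all = TRUE))
identical(before, after)
```

If the connection table grows between the two counts, something in the loop is leaving connections behind.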
Upvotes: 13