Reputation: 3403
I want to make an SQLite database from a list of data.frame(s) using the dplyr package. It looks like dplyr::copy_to
is what I need. I think the problem is related to NSE. Also see the vignette on dplyr with databases.
data(iris)
data(cars)
res <- list("iris" = iris, "cars" = cars)
my_db <- dplyr::src_sqlite(paste0(tempdir(), "/foobar.sqlite3"),
create = TRUE)
lapply(res, function(x) dplyr::copy_to(my_db, x))
Error: Table x already exists.
Upvotes: 0
Views: 763
Reputation: 9310
If someone prefers a {purrr}
option:
imap(res, ~dplyr::copy_to(dest = my_db,
df = .x,
name = .y,
overwrite = TRUE))
or
imap(res, ~dbWriteTable(conn = my_db, .y,
value = .x,
overwrite = TRUE))
Upvotes: 0
Reputation: 1427
The reason for this is because the default table name is based off of the name of the data frame in R. When using lapply
, it does not take the index name.
The documentation for dplyr::copy_to.src_sql
contains:
## S3 method for class 'src_sql'
copy_to(dest, df, name = deparse(substitute(df)),
types = NULL, temporary = TRUE, unique_indexes = NULL, indexes = NULL,
analyze = TRUE, ...)
The line name = deparse(substitute(df))
shows where the table name comes from.
We can see what that turns into:
res <- list("iris" = iris, "cars" = cars)
tmp = lapply(res, function(x) print(deparse(substitute(x))))
#> [1] "X[[i]]"
#> [1] "X[[i]]"
The name
for the table in the SQLite
source is X[[i]]
; once the first line has been executed, the table already exists.
We can resolve by either using an explicit for
loop and passing the name secondarily, or by using lapply
on an index number.
For instance:
res <- list("iris" = iris, "cars" = cars)
my_db <- dplyr::src_sqlite(paste0(tempdir(), "/foobar.sqlite3"),
create = TRUE)
lapply(seq_along(res), function(i, l){dplyr::copy_to(my_db, l[[i]], names(l)[[i]])}, l = res)
my_db %>% tbl("iris") %>% head
#> Source: query [?? x 5]
#> Database: sqlite 3.8.6
#>
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3.0 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5.0 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
my_db %>% tbl("cars") %>% head
#> Source: query [?? x 2]
#> Database: sqlite 3.8.6
#>
#> speed dist
#> <dbl> <dbl>
#> 1 4 2
#> 2 4 10
#> 3 7 4
#> 4 7 22
#> 5 8 16
#> 6 9 10
Upvotes: 1