Reputation: 5580
Working with sf
objects in R
, as well as tables in MariaDB
with geometry
(in this case point
) columns, I'm struggling to efficiently move data between the two (sf
object to MariaDB
table and vice-versa).
Note I'm using the RMariaDB
package to connect to MariaDB
, and have defined my connection here as consdb
.
Example data:
library(sf)
pnt <- data.frame( name = c("first", "second"),
lon = c(145, 146),
lat = c(-38, -39) )
pnt <- st_as_sf( pnt, coords = c("lon", "lat") )
Trying to write sf object directly
Ideally, I'd like to be able to write sf objects like this directly to MariaDB
using dbWriteTable
or dbAppendTable
. At the moment, that gives what look like compatibility errors.
If I try with dbWriteTable
:
dbWriteTable(consdb, "temp", pnt, temporary=TRUE, overwrite=TRUE)
# Error in result_bind(res@ptr, params) : Cannot get geometry object from data you send to the GEOMETRY field [1416]
Or by creating the table first:
dbExecute(consdb, "CREATE OR REPLACE TEMPORARY TABLE temp (name VARCHAR(10), geometry POINT)")
dbAppendTable(consdb, "temp", pnt)
# Error in result_bind(res@ptr, params) : Unsupported column type list
Trying to convert to point type on insert
If I were inserting with a SQL insert query, I'd use PointFromText
like so
INSERT INTO temp (name, geometry) VALUES ('new point', PointFromText('POINT(145 38)', 4326));
So I tried using that to send the data as a string. I wrote a couple of functions to convert the sf
geometry column into an appropriate string column:
# to convert 1 value
point_to_text <- function(x, srid = 4326) {
sprintf("PointFromText('POINT(%f %f)', %i)", x[1], x[2], srid)
}
# to apply the above over a whole column
points_to_text <- function(x, srid = 4326) {
vapply(x, point_to_text, srid = srid, NA_character_)
}
Used that to turn the sf
object into a data.frame
for_sql <- data.frame(pnt)
for_sql$geometry <- points_to_text(for_sql$geometry)
The geometry column is now a character column like: PointFromText('POINT(145.000000 -38.000000)', 4326)
Using dbWriteTable
would just create a text column so I try creating the table, then using dbAppendTable
:
dbExecute(consdb, "CREATE OR REPLACE TEMPORARY TABLE temp (name VARCHAR(10), geometry POINT)")
dbAppendTable(consdb, "temp", pnt)
# Error in result_bind(res@ptr, params) : Cannot get geometry object from data you send to the GEOMETRY field [1416]
Something that works, but seems silly
I can get this to work if I create a temporary SQL table, change the column to text, insert the data from R, convert the column in SQL, then append that to the original SQL table. It seems ridiculously convoluted, but just to show that it works:
# create temporary table
dbExecute(consdb, "CREATE OR REPLACE TEMPORARY TABLE temp_geom LIKE temp")
# change the geometry column to text
dbExecute(consdb, "ALTER TABLE temp_geom MODIFY COLUMN geometry TEXT")
# add the data to the temporary table
dbAppendTable(consdb, "temp_geom", for_sql)
# add a new point column
dbExecute(consdb, "ALTER TABLE temp_geom ADD COLUMN geom_conv POINT")
# convert strings to points
dbExecute(consdb, "UPDATE temp_geom SET geom_conv = PointFromText(geometry, 4326)")
# drop the old column and replace it with the new one
dbExecute(consdb, "ALTER TABLE temp_geom DROP COLUMN geometry")
dbExecute(consdb, "ALTER TABLE temp_geom CHANGE COLUMN geom_conv geometry POINT")
# append the data from the temporary table to the main one
dbExecute(consdb, "INSERT INTO temp SELECT * FROM temp_geom")
Are there any solutions others use for this, or anything that might solve the problem of passing data between sf
objects and MariaDB
tables?
EDIT TO ADD: As per comment from @SymbolixAU, I've now tried the following
st_write(
obj=pnt, # the sf class object, as created above
dsn=consdb, # the MariaDB connection
layer="temp", # the table name on MariaDB
append=TRUE,
layer_options=c('OVERWRITE=false', 'APPEND=true')
)
# Error in result_bind(res@ptr, params) :
Cannot get geometry object from data you send to the GEOMETRY field [1416]
Upvotes: 2
Views: 606
Reputation: 5580
I've come up with a bit of a hacky solution to this. It's not ideal, but I think it might be good enough.
Since the sf
package can convert a geometry column to a WKT string with st_as_text
, and MariaDB
can do the reverse with ST_GeomFromText
, I can use those to get things working.
One problem is that the function call I want to pass to MariaDB
(something like ST_GeomFromText('POINT(1 2)')
can't be passed to any of the usual table write functions like dbAppendTable
because they convert the function call to a text string (I assume by enquoting it), so I have to create my own insert query and call it with dbExecute
.
Here's the function I've come up with, the intention of which is to play the role of dbAppendTable
, when the object to write is an sf
object.
sf_dbAppendTable <- function(conn, name, value, srid = 4326) {
# convert the geometry columns to MariaDB function calls
sfc_cols <- vapply(value, inherits, NA, "sfc")
for(col in which(sfc_cols)) {
value[[col]] <- sprintf(
"ST_GeomFromText('%s', %i)",
sf::st_as_text( value[[col]] ),
srid
)
}
# when inserting to sql, surround some values in quotes, except a few types
# specifically exclude the geometry columns from this
cols_to_enquote <- vapply(value, function(x) {
if (inherits(x, "logical")) return( FALSE )
if (inherits(x, "integer")) return( FALSE )
if (inherits(x, "double")) return( FALSE )
return( TRUE )
}, NA) & !sfc_cols
# set aside column names
col_names <- names(value)
# convert to a matrix
value <- as.matrix(value)
# it should be character
if (typeof(value) != "character") value <- as.character(value)
# enquote the columns that need it, except for `NA` values, replace with `NULL`
value[ , which(cols_to_enquote) ] <- ifelse(
is.na(value[ , which(cols_to_enquote) ]),
"NULL",
paste0("'", value[ , which(cols_to_enquote) ], "'")
)
# any `NA` values still remaining, also replace with `NULL`
value[ is.na(value) ] <- "NULL"
# create a single insert query
sql_query <- sprintf(
"INSERT INTO %s (%s) VALUES (%s);",
name,
paste(col_names, collapse = ","),
paste(apply(value, 1, paste, collapse = ","), collapse = "),(")
)
# execute the query
dbExecute(conn, sql_query)
}
This seems to be working for me, but I'm sure it's nowhere near as robust or efficient as something like dbAppendTable
would be. For one thing I'm using a single query string which won't work well for large queries, and won't be as efficient as the LOAD DATA INFILE
method some packages manage to leverage.
If anyone has a better solution, I'd still love to hear it.
Upvotes: 1