Reputation: 111
What is the best way to add (spatial) data to a GeoMesa/Accumulo stack?
(1) If I understand correctly, a SimpleFeature creation file and converter file should be created in order to add the data. The data itself is stored as CSV. Am I correct that we must build these files for each CSV we wish to add?
(2) Are the examples below correct? For example, the geometry in the CSV files is stored as follows. " MULTILINESTRING((2.0116069 48.9172785,2.0116474 48.9172131,2.0117161 48.917135,2.011814 48.9170714,2.0118996 48.9170489))"
(3) How do we add these converter files to the process of adding the data to the GeoMesa/Accumulo stack?
The end goal is a simple procedure for adding data to the stack and, as a next step, serving that data through GeoServer.
Any kind of help is welcome. Thanks in advance.
Simple feature creation file:
```
geomesa.sfts.links_geom = {
  attributes = [
    { name = "id", type = "Long" }
    { name = "length", type = "Float" }
    { name = "number", type = "Integer" }
    ...
    { name = "geom", type = "MultiLineString", srid = 4326 }
  ]
}
```
Converter file:
```
geomesa.converters.links_geom = {
  type = "delimited-text",
  format = "CSV",
  id-field = "toString($id)",
  fields = [
    { name = "id", transform = "$1::long" }
    { name = "length", transform = "$2::float" }
    { name = "number", transform = "$3::int" }
    ...
    { name = "geom", transform = "multilinestring($11)" }
  ]
}
```
Upvotes: 0
Views: 284
Reputation: 1634
There is no single "best" way to ingest data into GeoMesa; it depends on your specific use case. The command-line tools provide an easy entry point, but more advanced scenarios might use Apache NiFi, a stream processing framework like Apache Storm, or cloud-native tools like AWS Lambda.
GeoMesa is a GeoTools data store, so you can write data using the DataStore API, without any Converter definitions. There are examples of this in the geomesa-tutorials project. However, Converters provide a declarative way to define your data type without any code. They can also be re-used across environments, so if you develop a Converter for the CLI tools, you can easily use the same definition in e.g. Apache NiFi, allowing you to scale and migrate your ingest as needed.
In general, with Converters you do need to define one per file format. GeoMesa offers type inference for CSV files as described here, which may let you ingest your data without a converter, or at least provide an initial template that you can tweak to your needs. There is information on adding your Converters to the classpath here and here.
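As a rough sketch of what "on the classpath" can look like for the definitions in the question: both can live in a single config file under the `geomesa.sfts` and `geomesa.converters` paths. The file name and location (an `application.conf` in the tools' `conf` directory) are assumptions here, so check the linked documentation for your GeoMesa version. The attribute and field lists are abbreviated to what the question shows.

```
# Assumed location: conf/application.conf of the GeoMesa tools distribution
geomesa {
  sfts {
    links_geom = {
      attributes = [
        { name = "id", type = "Long" }
        # remaining attributes as elided in the question
        { name = "geom", type = "MultiLineString", srid = 4326 }
      ]
    }
  }
  converters {
    links_geom = {
      type = "delimited-text"
      format = "CSV"
      id-field = "toString($id)"
      fields = [
        { name = "id", transform = "$1::long" }
        # remaining fields as elided in the question
        { name = "geom", transform = "multilinestring($11)" }
      ]
    }
  }
}
```

With the definitions loaded this way, the CLI should be able to refer to both by the name `links_geom` instead of a file path.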
When developing an initial Converter definition, it can be helpful to use the `convert` CLI command, with the error mode set to "raise-errors" as described here. Once your definition is solid, you can then proceed with ingestion.
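For illustration, the error mode is set in the converter's `options` block. This is a minimal sketch based on the converter from the question; the `error-mode` key is taken from the GeoMesa converter documentation as I understand it, so verify the option name against the version you are running.

```
geomesa.converters.links_geom = {
  type = "delimited-text"
  format = "CSV"
  id-field = "toString($id)"
  options = {
    # Fail on the first record that cannot be converted, instead of skipping it
    error-mode = "raise-errors"
  }
  fields = [
    { name = "id", transform = "$1::long" }
    # remaining fields as elided in the question
    { name = "geom", transform = "multilinestring($11)" }
  ]
}
```

Raising errors is mainly useful while developing the definition, because a bad transform surfaces immediately rather than silently dropping rows. For a bulk ingest you may prefer to remove the option and fall back to the default behavior.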
Upvotes: 1