OscarVGG

Reputation: 2670

How to import a GeoJSON file into MongoDB

Since GeoJSON is valid JSON, I thought I could use mongoimport to load data into my MongoDB database from a .geojson file.

But I'm getting the following error:

exception:BSON representation of supplied JSON is too large: code FailedToParse: FailedToParse: Expecting '{': offset:0

The file is 25MB and this is a fragment of it:

{
"type": "FeatureCollection",
"features": [
{
    "type": "Feature",
    "id": "node/2661561690",
    "properties": {
        "timestamp": "2014-02-08T17:58:24Z",
        "version": "1",
        "changeset": "20451306",
        "user": "Schandlers",
        "uid": "51690",
        "natural": "tree",
        "id": "node/2661561690"
    },
    "geometry": {
        "type": "Point",
        "coordinates": [
            -66.9162255,
            10.5056439
        ]
    }
},
// ... Omitted data
{
    "type": "Feature",
    "id": "node/2664472516",
    "properties": {
        "timestamp": "2014-02-10T04:27:30Z",
        "version": "2",
        "changeset": "20477473",
        "user": "albertoq",
        "uid": "527105",
        "name": "Distribuidora Brithijos (Aceites)",
        "shop": "car_parts",
        "id": "node/2664472516"
    },
    "geometry": {
        "type": "Point",
        "coordinates": [
            -66.9388903,
            10.4833647
        ]
    }
}
]
}

Upvotes: 25

Views: 18233

Answers (6)

FloorDivision

Reputation: 131

This Python script is designed to import GeoJSON files into MongoDB in one step: https://github.com/rtbigdata/geojson-mongo-import.py

Upvotes: 3

Kartoch

Reputation: 7779

If the problem is that your set of documents is larger than 16 MB, you can use the --batchSize option, which sets the number of documents in a batch. For instance:

mongoimport -d mydb -c mycol data.json -j 4 --batchSize=100

Note the -j option, which increases throughput to the database by using several insertion workers.

Strangely, the --batchSize option is not listed in the output of 'mongoimport --help'. Go figure!
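
To illustrate what "number of documents in a batch" means, here is a minimal Python sketch that splits a sequence of documents into groups of a fixed size. It is not how mongoimport is implemented; the function name `batches` is hypothetical and chosen only for this example.

```python
from itertools import islice


def batches(items, size):
    """Yield successive lists holding at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        # Take the next `size` items; the final batch may be shorter.
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Each yielded list could then be handed to a driver's bulk insert call (for example pymongo's `insert_many`), so that no single request has to carry the whole file.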

Upvotes: 0

hoogw

Reputation: 5535

ParoX's idea works great, but it has a 16 MB limit.

From the MongoDB documentation:

--jsonArray Accepts the import of data expressed with multiple MongoDB documents within a single JSON array. Limited to imports of 16 MB or smaller.

If the file is larger than 16 MB, you could do this:

jq --compact-output ".features[]" input.geojson > output.geojson

This gives you exactly one object per line, with no trailing comma:

{.....}
{.......}
{...}

{"type":"Feature","geometry":{"type":"Point","coordinates":[-80.87088507656375,35.21515162500578]},"properties":{"name":"ABBOTT NEIGHBORHOOD PARK","address":"1300  SPRUCE ST"}}
{"type":"Feature","geometry":{"type":"Point","coordinates":[-80.83775386582222,35.24980190252168]},"properties":{"name":"DOUBLE OAKS CENTER","address":"1326 WOODWARD AV"}}
{"type":"Feature","geometry":{"type":"Point","coordinates":[-80.83827000459532,35.25674709224663]},"properties":{"name":"DOUBLE OAKS NEIGHBORHOOD PARK","address":"2605  DOUBLE OAKS RD"}}
{"type":"Feature","geometry":{"type":"Point","coordinates":[-80.83697759172735,35.25751734669229]},"properties":{"name":"DOUBLE OAKS POOL","address":"1200 NEWLAND RD"}}
{"type":"Feature","geometry":{"type":"Point","coordinates":[-80.81647652154736,35.40148708491418]},"properties":{"name":"DAVID B. WAYMER FLYING REGIONAL PARK","address":"15401 HOLBROOKS RD"}}
{"type":"Feature","geometry":{"type":"Point","coordinates":[-80.83556459443902,35.39917224760999]},"properties":{"name":"DAVID B. WAYMER COMMUNITY PARK","address":"302 HOLBROOKS RD"}}
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[-80.72487831115721,35.26545403190955],[-80.72135925292969,35.26727607954368],[-80.71517944335938,35.26769654625573],[-80.7125186920166,35.27035945142482],[-80.70857048034668,35.268257165144064],[-80.70479393005371,35.268397319259996],[-80.70324897766113,35.26503355355979],[-80.71088790893555,35.2553619492954],[-80.71681022644043,35.2553619492954],[-80.7150936126709,35.26054831539319],[-80.71869850158691,35.26026797976481],[-80.72032928466797,35.26061839914875],[-80.72264671325684,35.26033806376283],[-80.72487831115721,35.26545403190955]]]},"properties":{"name":"Plaza Road Park"}}

Then import it without --jsonArray, since the file now holds one document per line rather than a single array:

mongoimport --db dbname -c collectionname --file "output.geojson"
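
If jq is not available, the same one-feature-per-line conversion can be sketched in Python with only the standard library. This is an illustration rather than a drop-in replacement for jq: the function name `feature_collection_to_ndjson` is hypothetical, and the sketch assumes the whole file fits in memory.

```python
import json


def feature_collection_to_ndjson(src_path, dst_path):
    """Write each feature of a GeoJSON FeatureCollection on its own line.

    Returns the number of features written.
    """
    with open(src_path, encoding="utf-8") as src:
        collection = json.load(src)  # reads the whole file into memory

    count = 0
    with open(dst_path, "w", encoding="utf-8") as dst:
        for feature in collection["features"]:
            # Compact separators keep each feature on a single line.
            dst.write(json.dumps(feature, separators=(",", ":"), ensure_ascii=False))
            dst.write("\n")
            count += 1
    return count
```

Calling `feature_collection_to_ndjson("input.geojson", "output.geojson")` would produce a file in the same one-document-per-line shape as the jq output above.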

Upvotes: 4

Menelaos Kotsollaris

Reputation: 5506

First of all, to verify that your GeoJSON file is valid, you could use Geojsonlint, QGIS and so on.

After that, to import your data into your collection, use mongoimport:
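
As a rough complement to those tools, a few structural checks can be sketched in Python. This is only a minimal sanity check under the assumption that the file is a FeatureCollection; it does not validate coordinates or geometry types, and the function name `check_feature_collection` is hypothetical.

```python
import json


def check_feature_collection(path):
    """Return a list of problems found; an empty list means the basic checks passed."""
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]

    if not isinstance(data, dict) or data.get("type") != "FeatureCollection":
        return ['top level is not an object with "type": "FeatureCollection"']

    features = data.get("features")
    if not isinstance(features, list):
        return ['"features" is missing or not an array']

    problems = []
    for i, feature in enumerate(features):
        if not isinstance(feature, dict) or feature.get("type") != "Feature":
            problems.append(f'features[{i}] is not an object with "type": "Feature"')
            continue
        # A Feature carries both members, although either value may be null.
        for member in ("geometry", "properties"):
            if member not in feature:
                problems.append(f'features[{i}] has no "{member}" member')
    return problems
```

An empty result only means the overall shape looks right; a dedicated validator is still the better choice for checking the geometries themselves.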

mongoimport --db MY_DATABASE_NAME -c MY_COLLECTION_NAME --type json --file "MY_GEOJSON_FILENAME"

Replace the three placeholders above with your own names, and make sure that your current directory contains the file.

Upvotes: 1

ParoX

Reputation: 5941

Download jq (it's a sed-like program, but for JSON).

Then run:

jq --compact-output ".features" input.geojson > output.geojson

then

mongoimport --db dbname -c collectionname --file "output.geojson" --jsonArray

Upvotes: 25

Adam

Reputation: 3288

Right now your file is one object wrapping an array of features, and MongoDB will treat it as a single document. Try deleting the following from the beginning of your GeoJSON:

{
"type": "FeatureCollection",
"features": [

Also, delete the following from the end of your GeoJSON:

]
}

EDIT - Also, mongoimport expects one document per line, so make sure that the only \n characters are between documents, e.g.

...    
},\n
    {
        "type": "Feature",
        "id": "node/2664472516",
...

Upvotes: 10
