Nikhil Chilwant

Reputation: 649

Which database is suitable for storing a large JSON file?

I have only one large JSON file. For example,

{   
    "Name": "Motor_M23",
    "AASID": {
        "IDType": "URI",
        "IDSpec": "http://acplt.org/AAS/Motor_M23"
    },
    "AssetID": {
        "IDType": "URI",
        "IDSpec": "http://acplt.org/Assets/Motor_M23"
    },
    "Header": {
        "PropertyValueStatementContainers": [
            {
                "Name": "Config",
                        .
                        .
                        .
                        .

I need to support the following operations:

  1. Querying for an element should return all of its child elements, e.g. querying for AssetID should return

    "AssetID": {
        "IDType": "URI",
        "IDSpec": "http://acplt.org/Assets/Motor_M23"
    }
    
  2. Update the values of elements.

  3. Delete elements.

  4. Move elements between hierarchy levels, e.g. make AssetID a child element of AASID.
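To make the requirements concrete, here is a minimal in-memory sketch of the four operations on the parsed JSON as a plain Python dict (this is just an illustration of the semantics, not a database solution; the `find` helper is hypothetical):

```python
import json

# Parsed form of the example document (truncated to the fields shown above).
doc = {
    "Name": "Motor_M23",
    "AASID": {
        "IDType": "URI",
        "IDSpec": "http://acplt.org/AAS/Motor_M23",
    },
    "AssetID": {
        "IDType": "URI",
        "IDSpec": "http://acplt.org/Assets/Motor_M23",
    },
}

def find(node, key):
    """Depth-first search: return the subtree stored under `key`, or None."""
    if key in node:
        return node[key]
    for value in node.values():
        if isinstance(value, dict):
            hit = find(value, key)
            if hit is not None:
                return hit
    return None

# 1. Query: fetching "AssetID" returns all of its child elements.
assert find(doc, "AssetID") == {
    "IDType": "URI",
    "IDSpec": "http://acplt.org/Assets/Motor_M23",
}

# 2. Update: change a value in place.
find(doc, "AASID")["IDType"] = "IRI"

# 3./4. Delete and move: pop "AssetID" out of the root (delete) and
# re-insert it as a child element of "AASID" (move).
doc["AASID"]["AssetID"] = doc.pop("AssetID")

print(json.dumps(doc, indent=2))
```

The question is essentially which database gives these four operations efficiently once the document no longer fits comfortably in memory.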

I considered the following approaches:

  1. Graph database: I started reading about Neo4j. However, it cannot create a graph from JSON intelligently; one has to specify the node types and their hierarchy order.
  2. ElasticSearch: It works by treating the JSON as text, which is not an efficient solution.
  3. Postgres: It supports querying over JSON objects, but updates and deletions won't be efficient.

Is there a good database out there which can load data from a large JSON file and handle my operations?

Upvotes: 9

Views: 15639

Answers (4)

Benjamin Bryant

Reputation: 31

Couchbase's SQL-like N1QL does deep JSON traversal and manipulation. It can also index nested array elements and UNNEST arrays at query time. It will do what you are asking.

Upvotes: 0

Bharanidharan K

Reputation: 699

This is a typical architectural question about choosing the right database, wherein you have to consider quite a few important aspects such as HA, resiliency, replication, sharding, tooling support, maturity, licensing, backup & restore, etc.

MongoDB and Couchbase are the two most popular and widely used document databases. There is no straightforward answer as to which to choose, as you have to do a trade-off analysis. I can share my two cents; hopefully this will help you arrive at the right decision.

Either the MongoDB or Couchbase NoSQL document database can be considered, as JSON is a first-class citizen in both and you get really good options for performing operations on fields.

  1. MongoDB (CP out of CAP) prefers consistency over availability, whereas Couchbase (AP out of CAP) is a highly available database.
  2. A MongoDB cluster works with a master/slave architecture, whereas a Couchbase cluster works with a peer-to-peer distribution architecture.

There are many more dimensions to be considered, and the following links should point you in the right direction.

https://suyati.com/blog/mongodb-vs-couchbase/

https://www.couchbase.com/comparing-couchbase-vs-mongodb

Since in your particular case you have highlighted that you have only one large file, IMDG-based solutions (in-memory data grids such as Apache Ignite) can also be considered with a single-node setup.

Upvotes: 3

user3788685

Reputation: 3113

If you are only working with JSON then you should really use a document-oriented database, as it will save you from having to wrestle with something SQL-related.

MongoDB is a good choice: it supports many drivers and can deal with tree structures (though I'm not sure about automatic creation from JSON).

CRUD operations are simple and cover a wide range of cases.

For very large datasets on busy servers you should use the XFS file system and the WiredTiger storage engine, as there are some performance gains to be had.

It's well supported, and there isn't much of a learning curve. (I came from pure SQL without too much trouble.)

You also have the option of MariaDB or MySQL, which both support JSON as well, though I have no experience with either; in the case of MySQL I feel it was just a "bolt-on" that had to be added in the face of an upcoming requirement.

Upvotes: 3

TopReseller

Reputation: 666

You can programmatically parse the JSON and save the data in SQL tables, depending on the operation logic you need:

  1. Keep all children (with all contents) as text in one table, per top element.

    Element | Children (text)

    "AssetID" | { "IDType": "URI", "IDSpec": "http://acplt.org/Assets/Motor_M23" }

  2. Keep elements in one table, and the first level of children (with all contents) in a second table, per top element.

    ID | Element

    1 | "AssetID"

    Element ID | Child (text)

    1 | "IDType": "URI"
    1 | "IDSpec": "http://acplt.org/Assets/Motor_M23"
  3. Keep element names in a table with an ID, a parent ID (to parse to unlimited depth), and a value if the element has no children. This way you can query all elements (at any level) and their content, and update as necessary. You can also parse from any element based on IDs and parent IDs.

ID | Element | Parent | Value

1  | "Name"   | 0 | "Motor_M23"
2  | "AASID"  | 0 | - 
3  | "IDType" | 2 | "URI"
4  | "IDSpec" | 2 | "http://acplt.org/AAS/Motor_M23"
5  | "AssetID"| 0 | -
6  | "IDType" | 5 | "URI"
7  | "IDSpec" | 5 | "http://acplt.org/Assets/Motor_M23"
8  | "Header" | 0 | -
9  | "PropertyValueStatementContainers" | 8 | -
10 | "Name"   | 9 | "Config"
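The third layout (the ID/parent-ID table) can be sketched with SQLite and a recursive CTE; the table and column names here are illustrative, not prescribed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE elements (
    id INTEGER PRIMARY KEY,
    element TEXT NOT NULL,
    parent INTEGER NOT NULL,   -- 0 means "top level"
    value TEXT                 -- NULL when the element has children
)""")
conn.executemany("INSERT INTO elements VALUES (?, ?, ?, ?)", [
    (1, "Name", 0, "Motor_M23"),
    (2, "AASID", 0, None),
    (3, "IDType", 2, "URI"),
    (4, "IDSpec", 2, "http://acplt.org/AAS/Motor_M23"),
    (5, "AssetID", 0, None),
    (6, "IDType", 5, "URI"),
    (7, "IDSpec", 5, "http://acplt.org/Assets/Motor_M23"),
])

# Query: fetch an element and all of its descendants with a recursive CTE.
subtree = conn.execute("""
    WITH RECURSIVE tree(id, element, parent, value) AS (
        SELECT id, element, parent, value FROM elements WHERE element = ?
        UNION ALL
        SELECT e.id, e.element, e.parent, e.value
        FROM elements e JOIN tree t ON e.parent = t.id
    )
    SELECT element, value FROM tree
""", ("AssetID",)).fetchall()
print(subtree)  # "AssetID" plus its two children

# Move: re-parenting is a single UPDATE of the parent ID
# (here: AssetID, id 5, becomes a child of AASID, id 2).
conn.execute("UPDATE elements SET parent = 2 WHERE id = 5")
```

With this layout, updates and deletes touch single rows, and moving a subtree is one UPDATE of the parent ID, at the cost of reassembling the JSON yourself when you need the whole document back.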

Upvotes: -1
