Marc
Marc

Reputation: 303

Retrieve a JSON object from JSON array using Cloudant

I am doing an API call every 40 mins to retrieve the current status information of every car in a car fleet. And each call adds one new JSON document to a Cloudant database. Each JSON document defines the current availability status for every car across many locations in many cities. There are currently around 2200 JSON documents in the database. All JSON documents have one field called payload that contains all information; it is a large array of objects. Instead of retrieving the whole payload array of objects I would like to retrieve only the needed info with a query (so, only one or several objects of that array). However, I have difficulty drafting a query that results only in the needed data.

Below, I'll explain my problem in more detail: When saving the JSON document to Cloudant, a timestamp is defined in the document. The _id parameter is defined to be equal to this timestamp. Below, I show a simplified version of these JSON documents:

 {
  "_id": "1540914946026",
  "_rev": "3-c1834c8a230cf772e41bbcb9cf6b682e",
  "timestamp": 1540914946026,
  "datetime": "2018-10-30 15:55:46",
  "payload": [
    {
      "cityName": "Abcoude",
      "locations": [
        {
          "address": "asterlaan 28",
          "geoPoint": {
            "latitude": 52.27312,
            "longitude": 4.96768
          },
          "cars": [
            {
              "mod": "BMW",
              "state": "FREE"
            }
          ]
        }
      ],
      "availableCars": 1,
      "occupiedCars": 0
    },
    {
      "cityName": "Alkmaar",
      "locations": [
        {
          "address": "Aert de Gelderlaan 14",
          "geoPoint": {
            "latitude": 52.63131,
            "longitude": 4.72329
          },
          "cars": [
            {
              "model": "Volswagen",
              "state": "FREE"
            }
          ]
        },
        {
          "address": "Ardennenstraat 49",
          "geoPoint": {
            "latitude": 52.66721,
            "longitude": 4.76046
          },
          "cars": [
            {
              "mod": "BMW",
              "state": "FREE"
            }
          ]
        },
        {
          "address": "Beneluxplein 7",
          "geoPoint": {
            "latitude": 52.65356,
            "longitude": 4.75817
          },
          "cars": [
            {
              "mod": "BMW",
              "state": "FREE"
            }
          ]
        },
        {
          "address": "Dr. Schaepmankade 1",
          "geoPoint": {
            "latitude": 52.62595,
            "longitude": 4.75122
          },
          "cars": [
            {
              "mod": "BMW",
              "state": "OCCUPIED"
            }
          ]
        },
        {
          "address": "Kennemerstraatweg",
          "geoPoint": {
            "latitude": 52.62909,
            "longitude": 4.74226
          },
          "cars": [
            {
              "model": "Mercedes",
              "state": "FREE"
            }
          ]
        },
        {
          "address": "NS Station Alkmaar Noord/Parkeerterrein Noord",
          "geoPoint": {
            "latitude": 52.64366,
            "longitude": 4.7627
          },
          "cars": [
            {
              "model": "Tesla",
              "state": "FREE"
            }
          ]
        },
        {
          "address": "NS Station Alkmaar/Stationsweg 56",
          "geoPoint": {
            "latitude": 52.6371,
            "longitude": 4.73935
          },
          "cars": [
            {
              "model": "Tesla",
              "state": "FREE"
            }
          ]
        },
        {
          "address": "Oude Hoeverweg",
          "geoPoint": {
            "latitude": 52.63943,
            "longitude": 4.72928
          },
          "cars": [
            {
              "model": "Tesla",
              "state": "FREE"
            }
          ]
        },
        {
          "address": "Parkeerterrein Wortelsteeg",
          "geoPoint": {
            "latitude": 52.63048,
            "longitude": 4.75487
          },
          "cars": [
            {
              "model": "Tesla",
              "state": "OCCUPIED"
            }
          ]
        },
        {
          "address": "Schoklandstraat 38",
          "geoPoint": {
            "latitude": 52.65812,
            "longitude": 4.75359
          },
          "cars": [
            {
              "model": "Volkswagen",
              "state": "FREE"
            }
          ]
        }
      ],
      "availableCars": 8,
      "occupiedCars": 2
    }
  ]
}

As you can see, the payload field is an array that has several objects (FYI: every object in this array represents one specific city: there are 1600 cities, so 1600 nested objects inside the payload array). Furthermore, inside each of the 1600 objects mentioned, other arrays and objects are again nested inside. For all objects in the payload array, the first field is cityName.

Furthermore, there is a nested array locations (inside each of the 1600 objects of the payload array) representing all addresses in a specific city. The locations array can be of size 1 to 600, meaning 1 to 600 nested objects / addresses per city. The last two fields in all objects of the payload array are availableCars and occupiedCars.

I want query documents to see how many cars are available and occupied for a specific city during a specific time interval. To do this:

For example, in this simplified example, I would like to query for the status information (availableCars & `occupiedCars) for the city of Alkmaar from 1540914946026 (epoch time) until now. I would like to get the following result:

{
 "id":"1540914946026",
 "cityName":"Alkmaar",
 "availableCars":8,
 "occupiedCars":2
 }

This is just an example, in reality, I want to be able to query for other cities as well, or query for several cities together and then get for each of those cities the number of available cars availableCars and the number of occupied cars occupiedCars.

Could anyone help me to define a query and index to be able to get the above result? Can I do this with cloudant query?

Upvotes: 0

Views: 484

Answers (1)

xpqz
xpqz

Reputation: 3737

Your data model does not play to Cloudant's strengths. Let each document group data that changes and is accessed together. Your items in your payload array would be much better stored as discrete documents.

If you find yourself reaching into growing arrays inside documents for subsets of data, this is a warning sign that your data model is not ideal: the document is now mutable and growing (with potential update conflicts as a result), and access becomes more cumbersome over time as Cloudant has no mechanism to only retrieve parts of a document. Moreover, Cloudant has a limit (1M) on document size, so by using your proposed model, you will likely hit that limit, too, and your application would stop working.

With that said, it is possible to create a view index that lets you emit each component of your payload, which would let you look up data per city -- but that solution is still subject to all the limitations above (document model is mutable, documents grow large etc).

Rule of thumb: small documents. Immutable model, where possible. Documents group data that either change, or are accessed as a unit.

Upvotes: 2

Related Questions