user1
user1

Reputation: 4131

How to parse JSON value of a text column in cassandra

I have a column of text type be contain JSON value.

{
    "customer": [
        {
            "details": {
                "customer1": {
                    "name": "john",
                    "addresses": {
                        "address1": {
                            "line1": "xyz",
                            "line2": "pqr"
                        },
                        "address2": {
                            "line1": "abc",
                            "line2": "efg"
                        }
                    }
                }
                "customer2": {
                    "name": "robin",
                    "addresses": {
                        "address1": null
                    }
                }
            }
        }
    ]
}

How can I extract 'address1' JSON field of column with query?

First I am trying to fetch JSON value then I will go with parsing.

SELECT JSON customer from text_column;

With my query, I get following error.

com.datastax.driver.core.exceptions.SyntaxError: line 1:12 no viable alternative at input 'customer' (SELECT [JSON] customer...)
com.datastax.driver.core.exceptions.SyntaxError: line 1:12 no viable alternative at input 'customer' (SELECT [JSON] customer...)

Cassandra version 2.1.13

Upvotes: 0

Views: 6968

Answers (2)

Maneesh K Bishnoi
Maneesh K Bishnoi

Reputation: 139

If your Cassandra version is 2.1x and below, you can use the Python-based approach. Write a python script using Cassandra-Python API

Here you have to get your row first and then use python json's loads method, which will convert your json text column value into JSON object which will be dict in Python. Then you can play around with Python dictionaries and extract your required nested keys. See the below code snippet.

from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
import json

if __name__ == '__main__':
   
    auth_provider = PlainTextAuthProvider(username='xxxx', password='xxxx')
    cluster = Cluster(['0.0.0.0'],
                      port=9042, auth_provider=auth_provider)
    session = cluster.connect("keyspace_name")
    print("session created successfully")
    rows = session.execute('select * from user limit 10')

    for user_row in rows:
        customer_dict = json.loads(user_row.customer)
        print(customer_dict().keys()
        

Upvotes: 0

Ashraful Islam
Ashraful Islam

Reputation: 12830

You can't use SELECT JSON in Cassandra v2.1.x CQL v3.2.x

For Cassandra v2.1.x CQL v3.2.x :

The only supported operation after SELECT are :

  • DISTINCT
  • COUNT (*)
  • COUNT (1)
  • column_name AS new_name
  • WRITETIME (column_name)
  • TTL (column_name)
  • dateOf(), now(), minTimeuuid(), maxTimeuuid(), unixTimestampOf(), typeAsBlob() and blobAsType()

In Cassandra v2.2.x CQL v3.3.x Introduce : SELECT JSON

With SELECT statements, the new JSON keyword can be used to return each row as a single JSON encoded map. The remainder of the SELECT statment behavior is the same.

The result map keys are the same as the column names in a normal result set. For example, a statement like “SELECT JSON a, ttl(b) FROM ...” would result in a map with keys "a" and "ttl(b)". However, this is one notable exception: for symmetry with INSERT JSON behavior, case-sensitive column names with upper-case letters will be surrounded with double quotes. For example, “SELECT JSON myColumn FROM ...” would result in a map key "\"myColumn\"" (note the escaped quotes).

The map values will JSON-encoded representations (as described below) of the result set values.

Upvotes: 1

Related Questions