Sparksql read json with inner array

Question

I'm trying to read json into dataset (spark 2.3.2). Unfortunately it doesn't works well .

Here is the data ,it's a json file with a inner array

{ "Name": "helloworld", 
  "info": { "privateInfo": [ {"salary":1200}, {"sex":"M"}],
            "house": "sky road" 
          }, 
  "otherinfo":2
}   
{ "Name": "helloworld2",
  "info": { "privateInfo": [ {"sex":"M"}],
            "house": "sky road" 
          }, 
  "otherinfo":3
}

I use sparksession to select the columns, but it has some problems: the result is not the data itself，but in an array.

val sqlDF = spark.sql("SELECT name , info.privateInfo.salary ,info.privateInfo.sex   FROM people1 ")
    sqlDF.show()

But columns salary & sex are in an array:

+-----------+-------+-----+
|       name| salary|  sex|
+-----------+-------+-----+
| helloworld|[1200,]|[, M]|
|helloworld2|     []|  [M]|
+-----------+-------+-----+

How could I get the data with the datatype itself?

Such as

+-----------+-------+-----+
|       name| salary|  sex|
+-----------+-------+-----+
| helloworld|  1200 |  M  |
|helloworld2|none/null| M |
+-----------+-------+-----+

Sparksql read json with inner array

Answers (1)

Related Questions