user12625679

Reputation: 696

Extract value from array in Spark

I am trying to extract a value from an array in SparkSQL, but I am getting the errors below:

Example column

customer_details  
{"original_customer_id":"ch_382820","first_customer_id":"ch_343948"}

I am using this code:

get_json_object(customer_details, '$.original_customer_id') as customer_id

But I am getting the following error:

error: invalid string interpolation $., expected: $$, $identifier or ${expression}
       spark.sql(s"""

error: unclosed character literal (or use " not ' for string literal)
           get_json_object(customer_details, '$.original_customer_id') as customer_id,
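
Presumably the statement is being built inside an s-interpolated string; both messages are Scala compiler errors about how that string is parsed, not Spark errors. A rough sketch of the failing call, with a hypothetical table name:

// Presumed shape of the failing code: the s prefix makes Scala try to parse
// "$." as string interpolation, which is what the first message complains about.
// This is only the reproduction, so it does not compile.
spark.sql(s"""
    select
        get_json_object(customer_details, '$.original_customer_id') as customer_id
    from some_table
""")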

Upvotes: 1

Views: 513

Answers (1)

ZygD

Reputation: 24356

The following worked for me:

import spark.implicits._  // needed for Seq(...).toDF outside spark-shell, where it is pre-imported

val df = Seq("{'original_customer_id':'ch_382820','first_customer_id':'ch_343948'}").toDF("customer_details")
df.show(truncate=false)
// +--------------------------------------------------------------------+
// |customer_details                                                    |
// +--------------------------------------------------------------------+
// |{'original_customer_id':'ch_382820','first_customer_id':'ch_343948'}|
// +--------------------------------------------------------------------+

df.selectExpr("get_json_object(customer_details, '$.original_customer_id') as customer_id").show()
// +-----------+
// |customer_id|
// +-----------+
// |  ch_382820|
// +-----------+
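
The same extraction also works through the DataFrame API, since get_json_object is exposed in org.apache.spark.sql.functions; a small sketch on the df built above:

// Same extraction via the DataFrame API instead of selectExpr
import org.apache.spark.sql.functions.{col, get_json_object}

df.select(
    get_json_object(col("customer_details"), "$.original_customer_id").as("customer_id")
  ).show()
// same single-row result: ch_382820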

As requested, this is the Spark SQL version. Register the DataFrame as a temp view first, then run the query through spark.sql:

df.createOrReplaceTempView("df")

spark.sql(
    """
    select get_json_object(customer_details, '$.original_customer_id') as customer_id
    from df
    """
).show()
// +-----------+
// |customer_id|
// +-----------+
// |  ch_382820|
// +-----------+
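
As for the original error: it comes from the s prefix on the interpolated string, not from get_json_object. Inside an s"""...""" string a bare $ starts interpolation, so either drop the prefix (as above) or, if interpolation is actually needed, escape the dollar in the JSON path as $$. A sketch assuming a table name really is being spliced in:

// With an s-interpolated string, write the JSON-path dollar as $$ so Scala
// emits a literal $; tableName below is a hypothetical interpolated variable.
val tableName = "df"
spark.sql(s"""
    select get_json_object(customer_details, '$$.original_customer_id') as customer_id
    from $tableName
""").show()
// same result as above: ch_382820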

Upvotes: 1
