Reputation: 1775
I am using a nested data structure (an array) to store multivalued attributes in a Spark table. I use array_contains(array, value) in Spark SQL to check whether the array contains a given value, but it seems to perform poorly: queries take a long time on a large table. Is there an alternative?
Upvotes: 0
Views: 1899
Reputation: 25939
You didn't supply many details about what exactly you are doing. If you access the values inside the array a lot, it can be beneficial to add a column holding the value(s) from the array, e.g. by using explode, and filter on that column instead.
Upvotes: 1