Reputation: 103
I'm looking for a way to query nested arrays in ArangoDB.
The JSON structure I have is:
{
"uid": "bykwwla4prqi",
"category": "party",
"notBefore": "2016-04-19T08:43:35.388+01:00",
"notAfter": "9999-12-31T23:59:59.999+01:00",
"version": 1.0,
"aspects": [
"participant"
],
"description": [
{ "value": "User Homer Simpson, main actor in 'The Simpsons'", "lang": "en"}
],
"properties": [
{
"property": [
"urn:project:domain:attribute:surname"
],
"values": [
"Simpson"
]
},
{
"property": [
"urn:project:domain:attribute:givennames"
],
"values": [
"Homer",
"Jay"
]
}
]
}
I tried to use a query like the following to find all parties having a given name 'Jay':
FOR r IN resource
FILTER "urn:project:domain:attribute:givennames" IN r.properties[*].targets[*]
AND "Jay" IN r.properties[*].values[*]
RETURN r
but unfortunately it does not work - it returns an empty array. If I use a '1' instead of '*' for the properties array it works. But the array of the properties has no fixed structure.
Does anybody have an idea how to solve this?
Thanks a lot!
Upvotes: 3
Views: 1782
Reputation: 1
I came across a similar problem. My data set was an array of objects called offers
, inside which another array called metaData
was present.
data Set Example :
offers : [{
Country : 1,
Status:1,
OfferCode :"TEST",
EndUtc : 1234455,
metaData :[{name : "isNewUSer",message :"yes"},
{name : "cohort",message :"bro please"}]
}]
I wanted to fetch all the offers whose metaData
had isNewUser
at least once.
Here is the sample query for this requirement.
for o in offers
let a = o.metaData
let c = (for m in a filter m.name == 'IsNewUser' return a)
filter LENGTH(c) > 0 and o.Country == "14" and o.Status ==1
return o.OfferCode
This will return all the OfferCode
s which has IsNewUser
in its metaData
.
Upvotes: 0
Reputation: 6067
You can inspect what the filter does using a simple trick: you RETURN
the actual filter condition:
db._query(`FOR r IN resource RETURN r.properties[*].property[*]`).toArray()
[
[
[
"urn:project:domain:attribute:surname"
],
[
"urn:project:domain:attribute:givennames"
]
]
]
which makes it pretty clear whats going on. The IN
operator can only work on one dimensional arrays. You could work around this by using FLATTEN()
to remove the sub layers:
db._query(`FOR r IN resource RETURN FLATTEN(r.properties[*].property[*])`).toArray()
[
[
"urn:project:domain:attribute:surname",
"urn:project:domain:attribute:givennames"
]
]
However, while your documents are valid json (I guess its converted from xml?) you should alter the structure as one would do it in json:
"properties" : {
"urn:project:domain:attribute:surname":[
"Simpson"
],
"urn:project:domain:attribute:givennames": [
"Homer",
"Jay"
]
}
Since the FILTER
combination you specify would also find any other Jay
(not only those found in givennames
) and the usage of FLATTEN()
will prohibit using indices in your filter statement. You don't want to use queries that can't use indices on reasonably sized collections for performance reasons.
In Contrast you can use an array index on givennames
with the above document layout:
db.resource.ensureIndex({type: "hash",
fields:
["properties.urn:project:domain:attribute:givennames[*]"]
})
Now doublecheck the explain for the query:
db._explain("FOR r IN resource FILTER 'Jay' IN " +
"r.properties.`urn:project:domain:attribute:givennames` RETURN r")
...
6 IndexNode 1 - FOR r IN resource /* hash index scan */
...
Indexes used:
By Type Collection Unique Sparse Selectivity Fields Ranges
6 hash resource false false 100.00 % \
[ `properties.urn:project:domain:attribute:givennames[*]` ] \
("Jay" in r.`properties`.`urn:project:domain:attribute:givennames`)
that its using the index.
Upvotes: 3