Reputation: 8934
How to you build a NoSQL model and index (preferably for RavenDb v4) for the following relational schema?
Document type
Contact
, where each record can have multiple additional properties (type of the property is defined inCustomField
and the value inContactCustomField
)
Considering a need to filter/sort on the highlighted fields in one query (all fields from the Contact plus custom fields).
Possible options as I see:
Option #1
Naturally, I'd imagine the following persistent models:
public class Contact
{
public string Id { get; set; }
public string Name { get; set; }
public string Address { get; set; }
public string Phone { get; set; }
// Where the key is CustomField.Id and the value is ContactCustomField.Value
public Dictionary<string, string> CustomValues { get; set; }
}
public class CustomField
{
public string Id { get; set; }
public string Code { get; set; }
public string DataType { get; set; }
public string Description { get; set; }
}
However, building an index for a query like below (sorry for a mixed syntax) puzzles me:
SELECT Name, Address, Phone, CustomValues
FROM Contact
WHERE Name LIKE '*John*' AND CustomValues.Any(v => v.Key == "11" && v.Value == "student")
Option #2
Another approach would be keeping a normalised structure (as shown on the picture above). Then it would work - I'd just have to include ContactCustomField
in the query for Contact
.
The downside would be not utilising benefits of NoSQL.
Upvotes: 1
Views: 630
Reputation: 8934
Updated answer (29 June 2018)
The key to success is in one undervalued Raven's feature - Indexes with Dynamic Fields. It allows to keep logical data structure and avoids creating a fanout index.
The way to use is to build collections like described above in option #1:
public class Contact
{
public string Id { get; set; }
public string Name { get; set; }
public string Address { get; set; }
public string Phone { get; set; }
public Dictionary<string, object> CustomFields { get; set; }
}
public class CustomField
{
public string Id { get; set; }
public string Code { get; set; }
public string DataType { get; set; }
public string Description { get; set; }
}
where Contact.CustomFields.Key
is a reference to CustonField.Id
and Contact.CustomFields.Value
stores a value for that custom field.
In order to filter/search on the custom fields, we need the following index:
public class MyIndex : AbstractIndexCreationTask<Contact>
{
public MyIndex()
{
Map = contacts =>
from e in contacts
select new
{
_ = e.CustomFields.Select( x => CreateField ($"{nameof(Contact.CustomFields)}_{x.Key}", x.Value))
};
}
}
That index will cover all key-value pairs of the dictionary as they were ordinary properties of the Contact
.
Gotcha
There is a big gotcha if you write queries in C# with using the usual Query object (IRavenQueryable
type), rather than RQL
or DocumentQuery
. It's in the way we named the dynamic fields - it's a compound name in specific format: dictionary_name + underscore + key_name
. It allows us to build queries like
var q = s.Query<Person, MyIndex>()
.Where(p => p.CustomFields["Age"].Equals(4));
Which under the hood gets converted into RQL:
from index 'MyIndex' where CustomFields_Age = $p1
It's undocumented and here is my discussion with Oren Eini (aka Ayende Rahien) where you can learn more on the subject.
P.S. My general recommendation would be to interact with Raven via DocumentQuery
rather than the usual Query
(link), as LINQ integration is still quite weak and devs may keep stumbling upon bugs here and there.
Initial answer (9 June 2018)
As it was suggested by Oren Eini (aka Ayende Rahien), the way to go is option #2 - including a separate ContactCustomField
collection in the queries.
So in spite of using a NoSQL database, relational approach is the only way to go here.
Upvotes: 2
Reputation: 159
For this you probably want to use Map-Reduced indexes.
Map:
docs.Contacts.SelectMany(doc => (doc, next) => new{
// Contact Fields
doc.Id,
doc.Name,
doc.Address,
doc.Phone,
doc.CustomFieldLoaded = LoadDocument<string>(doc.CustomValueField, "CustomFieldLoaded"),
doc.CustomValues
});
Reduce:
from result in results
group result by {result.Id, result.Name, result.Address, result.Phone, result.CustomValues, result.CustomFieldLoaded} into g
select new{
g.Key.Id,
g.Key.Name,
g.Key.Address,
g.Key.Phone,
g.Key.CustomFieldLoaded = new {},
g.Key.CustomValues = g.CustomValues.Select(c=> g.Key.CustomFieldLoaded[g.Key.CustomValues.IndexOf(c)])
}
Your document would look something like this:
{
"Name": "John Doe",
"Address": "1234 Elm St",
"Phone": "000-000-0000",
CustomValues: "{COLLECTION}/{DOCUMENTID}"
}
This would load the contact and then load the relational documents' data.
I haven't tested this exact example but it's based off of a working example that I have implemented in my own project. You may need to do some tweaking.
You will of course need to tweak it to include many documents but it should give you a basic idea of how to use relationships.
You should also checkout documentation for document relationships.
I hope this helps.
Upvotes: 0