Alex Klaus
Alex Klaus

Reputation: 8934

Many-to-many relationship RavenDb: Document structure and index

How to you build a NoSQL model and index (preferably for RavenDb v4) for the following relational schema?

Document type Contact, where each record can have multiple additional properties (type of the property is defined in CustomField and the value in ContactCustomField) enter image description here

Considering a need to filter/sort on the highlighted fields in one query (all fields from the Contact plus custom fields).


Possible options as I see:

Option #1

Naturally, I'd imagine the following persistent models:

public class Contact
{
    public string Id      { get; set; }
    public string Name    { get; set; }
    public string Address { get; set; }
    public string Phone   { get; set; }
    // Where the key is CustomField.Id and the value is ContactCustomField.Value
    public Dictionary<string, string> CustomValues { get; set; }
}

public class CustomField
{
    public string Id          { get; set; }
    public string Code        { get; set; }
    public string DataType    { get; set; }
    public string Description { get; set; }
}

However, building an index for a query like below (sorry for a mixed syntax) puzzles me:

SELECT Name, Address, Phone, CustomValues
FROM Contact
WHERE Name LIKE '*John*' AND CustomValues.Any(v => v.Key == "11" && v.Value == "student")

Option #2

Another approach would be keeping a normalised structure (as shown on the picture above). Then it would work - I'd just have to include ContactCustomField in the query for Contact.

The downside would be not utilising benefits of NoSQL.

Upvotes: 1

Views: 630

Answers (2)

Alex Klaus
Alex Klaus

Reputation: 8934

Updated answer (29 June 2018)

The key to success is in one undervalued Raven's feature - Indexes with Dynamic Fields. It allows to keep logical data structure and avoids creating a fanout index.

The way to use is to build collections like described above in option #1:

public class Contact
{
    public string Id      { get; set; }
    public string Name    { get; set; }
    public string Address { get; set; }
    public string Phone   { get; set; }
    public Dictionary<string, object> CustomFields { get; set; }
}

public class CustomField
{
    public string Id          { get; set; }
    public string Code        { get; set; }
    public string DataType    { get; set; }
    public string Description { get; set; }
}

where Contact.CustomFields.Key is a reference to CustonField.Id and Contact.CustomFields.Value stores a value for that custom field.

In order to filter/search on the custom fields, we need the following index:

public class MyIndex : AbstractIndexCreationTask<Contact>
{
    public MyIndex()
    {
        Map = contacts =>
            from e in contacts
            select new
            {
                _ = e.CustomFields.Select( x => CreateField ($"{nameof(Contact.CustomFields)}_{x.Key}", x.Value))
            };
    }
} 

That index will cover all key-value pairs of the dictionary as they were ordinary properties of the Contact.

Gotcha

There is a big gotcha if you write queries in C# with using the usual Query object (IRavenQueryable type), rather than RQL or DocumentQuery. It's in the way we named the dynamic fields - it's a compound name in specific format: dictionary_name + underscore + key_name. It allows us to build queries like

var q = s.Query<Person, MyIndex>()
                .Where(p => p.CustomFields["Age"].Equals(4));

Which under the hood gets converted into RQL:

from index 'MyIndex' where CustomFields_Age = $p1

It's undocumented and here is my discussion with Oren Eini (aka Ayende Rahien) where you can learn more on the subject.

P.S. My general recommendation would be to interact with Raven via DocumentQuery rather than the usual Query (link), as LINQ integration is still quite weak and devs may keep stumbling upon bugs here and there.


Initial answer (9 June 2018)

As it was suggested by Oren Eini (aka Ayende Rahien), the way to go is option #2 - including a separate ContactCustomField collection in the queries.

So in spite of using a NoSQL database, relational approach is the only way to go here.

Upvotes: 2

user2292759
user2292759

Reputation: 159

For this you probably want to use Map-Reduced indexes.

Map:

docs.Contacts.SelectMany(doc => (doc, next) => new{
// Contact Fields
doc.Id,
doc.Name,
doc.Address,
doc.Phone,
doc.CustomFieldLoaded = LoadDocument<string>(doc.CustomValueField, "CustomFieldLoaded"),
doc.CustomValues
});

Reduce:

from result in results
group result by {result.Id, result.Name, result.Address, result.Phone, result.CustomValues, result.CustomFieldLoaded} into g
select new{
g.Key.Id,
g.Key.Name,
g.Key.Address,
g.Key.Phone,
g.Key.CustomFieldLoaded = new {},
g.Key.CustomValues = g.CustomValues.Select(c=> g.Key.CustomFieldLoaded[g.Key.CustomValues.IndexOf(c)])
}

Your document would look something like this:

{
"Name": "John Doe",
"Address": "1234 Elm St",
"Phone": "000-000-0000",
CustomValues: "{COLLECTION}/{DOCUMENTID}"
}

This would load the contact and then load the relational documents' data.

I haven't tested this exact example but it's based off of a working example that I have implemented in my own project. You may need to do some tweaking.

You will of course need to tweak it to include many documents but it should give you a basic idea of how to use relationships.

You should also checkout documentation for document relationships.

I hope this helps.

Upvotes: 0

Related Questions