urig
urig

Reputation: 16851

How to merge DB rows into a document?

The setup for this question is a bit elaborate, please bear with me :)

My scenario is that of migrating data from relational table form to document collection form ("ETL"). My source data is represented in a relational table where each row describes a "Person" and one of his/her "Addresses". (Naturally this table isn't normalized).

A simplified example would be:

PersonId PersonName AddressId AddressText
1        John Doe   1         Somewhere
1        John Doe   2         Elsewhere
2        Jane Doe   3         Some other place

To complicate things, some of the rows can update previous existing values. Ex:

PersonId PersonName AddressId AddressText
1        John Doe   1         A new address

I would like to migrate this data into a document collection in a document store (DB) for easy retrieval of specific Person entities with all their related Addresses. To begin with I am considering a single document per Person, like so:

{
   "id":"1",
   "addresses":[
      {
         "id":"1",
         "text":"A new address"
      },
      {
         "id":"2",
         "text":"Elsewhere"
      }
   ]
}

MY QUESTION IS: What is the best way to merge the table rows into a document?

Is the best approach to take a document store that supports partial updates (MongoDB, RavenDB) and use that feature?

Or maybe I should fetch the document as a whole from the document DB, update it in my business logic layer and save it whole back to the DB?

Or should I aim for a different document representation of the data altogether?

Upvotes: 1

Views: 91

Answers (2)

Jens Pettersson
Jens Pettersson

Reputation: 1177

You might want to ask yourself WHY you want to do a "partial update" on a document. If the requirement is that you should be able to change/update a person's address you might want to model that behaviour in the Person entity (kind of what you say with "update it in my business logic layer").

Changing address information seems like a domain responsibility and should (imo) be modeled accordingly. A quick example using RavenDb in C#:

using (var session = _documentStore.OpenSession())
{
    var person = session.Load<Person>("persons/1");
    person.UpdateAddressInformation(someAddressInformation);

    session.SaveChanges();
}

But if you really need to do partial document updates, you can do that with RavenDb as well:

http://ravendb.net/docs/article-page/2.5/csharp/client-api/partial-document-updates

//J

Upvotes: 1

datasci
datasci

Reputation: 1029

Your document could look something like this in MongoDB:

{
    person_id:PersonId,
    name: PersonName,
    addresses:[{
              id:AddressId1,
              street:AddressText1
              },
              {
              id:AddressId2,
              street:AddressText2
              },
              ...
              {
              id:AddressIdn,
              street:AddressTextn
              }]
}

Then, we could update the second address this way:

db.collection.update({person_id:PersonId,"addresses.id":AddressId2},{$set:{"addresses.$.street":NewAddress}})

See here for additional detail: http://docs.mongodb.org/manual/reference/operator/update/positional/#update-embedded-documents-using-multiple-field-matches

Hope this helps.

Upvotes: 1

Related Questions