Connell.O'Donnell

Reputation: 3723

Copy nested objects from SQL Server to Azure CosmosDB using a Data Factory

Let's say I have the following data structure:

public class Account
{
    public int AccountID { get; set; }
    public string Name { get; set; }
}

public class Person
{
    public int PersonID { get; set; }
    public string Name { get; set; }
    public List<Account> Accounts { get; set; }
}

I want to move my data from a SQL Server database to Azure Cosmos DB using a Data Factory. For each person, I want to create a JSON document containing the accounts as nested objects, like this:

"PersonID": 1,
"Name": "Jim",
"Accounts": [{
    "AccountID": 1,
    "PersonID": 1,
    "Name": "Home"
},
{
    "AccountID": 2,
    "PersonID": 1,
    "Name": "Work"
}]

I wrote a stored procedure to retrieve my data. In order to include the accounts as nested objects, I convert the SQL query's result to JSON:

select (select *
        from Person p
        join Account Accounts on Accounts.PersonID = p.PersonID
        for json auto) as JsonResult

Unfortunately, my data gets copied into a single field instead of the proper object structure.


Does anyone know what I should do to fix this?

Edit: There is a similar question here, but I didn't find a good answer: Is there a way to insert a document with a nested array in Azure Data Factory?

Upvotes: 2

Views: 670

Answers (1)

Connell.O'Donnell

Reputation: 3723

For anyone in the same situation, I ended up writing a .NET application to read the entries from the database and import them using the SQL API.

https://learn.microsoft.com/en-us/azure/cosmos-db/create-sql-api-dotnet
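For reference, here is a minimal sketch of that per-document approach (the stored procedure name and method signature are illustrative; Person is the class from the question). It reads the JsonResult value produced by the stored procedure, deserializes it, and inserts each person individually:

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;
using Newtonsoft.Json;

public static async Task ImportPeopleAsync(
    DocumentClient client, string sqlConnectionString,
    string database, string collection)
{
    var collectionUri = UriFactory.CreateDocumentCollectionUri(database, collection);

    using (var connection = new SqlConnection(sqlConnectionString))
    using (var command = new SqlCommand("dbo.GetPeopleAsJson", connection) // hypothetical proc name
        { CommandType = CommandType.StoredProcedure })
    {
        connection.Open();

        // The stored procedure returns a single JsonResult value containing
        // all people with their accounts nested, so deserialize the lot.
        var json = (string)command.ExecuteScalar();
        var people = JsonConvert.DeserializeObject<List<Person>>(json);

        foreach (var person in people)
        {
            // One round trip per document -- simple, but slow for large imports.
            await client.CreateDocumentAsync(collectionUri, person);
        }
    }
}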

That method is a little slow for large imports because it has to serialize each object and then import them individually. A much faster way I found later is to use the bulk executor library, which allows you to import JSON in bulk without serializing it first:

https://github.com/Azure/azure-cosmosdb-bulkexecutor-dotnet-getting-started

https://learn.microsoft.com/en-us/azure/cosmos-db/bulk-executor-overview

Edit

After installing the NuGet package Microsoft.Azure.CosmosDB.BulkExecutor:

using System;
using System.Linq;
using Microsoft.Azure.CosmosDB.BulkExecutor;
using Microsoft.Azure.Documents.Client;

// Connect to Cosmos DB and look up the target collection.
var documentClient = new DocumentClient(new Uri(connectionConfig.Uri), connectionConfig.Key);
var dataCollection = documentClient.CreateDocumentCollectionQuery(UriFactory.CreateDatabaseUri(database))
    .Where(c => c.Id == collection)
    .AsEnumerable()
    .FirstOrDefault();

// Initialize the bulk executor for that collection.
var bulkExecutor = new BulkExecutor(documentClient, dataCollection);
await bulkExecutor.InitializeAsync();

Then import the docs:

var response = await bulkExecutor.BulkImportAsync(documents);
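The documents argument can be a list of raw JSON strings, so the FOR JSON output from the stored procedure can be fed in without deserializing it first. A rough sketch of the call and its response (variable names and option values here are illustrative):

// documents: one JSON string per person, e.g. collected from the stored procedure.
var documents = new List<string>();

var response = await bulkExecutor.BulkImportAsync(
    documents,
    enableUpsert: true,                   // overwrite documents that already exist
    disableAutomaticIdGeneration: false); // generate ids for documents without one

// The response summarizes the import.
Console.WriteLine($"Imported {response.NumberOfDocumentsImported} documents, " +
    $"consuming {response.TotalRequestUnitsConsumed} RUs in {response.TotalTimeTaken}.");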

Upvotes: 3
