Eric Russell
Eric Russell

Reputation: 115

How to import documents with nested collections from SQL Server into DocumentDB using Data Migration tool

I am attempting to use the Cosmos DB Data Migration tool to import from SQL Server 2014 to DocumentDB. The following is a sample SELECT statement:

SELECT 
Sales.SalesOrderNumber AS [ID]
, Product.ProductName AS [Product.Name]
, Product.UnitPrice AS [Product.Price]
, Sales.SalesQuantity AS [Product.Quantity]
FROM ContosoRetailDW.dbo.FactOnlineSales AS Sales 
JOIN ContosoRetailDW.dbo.DimProduct AS Product ON Product.ProductKey = Sales.ProductKey
WHERE Sales.SalesOrderNumber IN ('20070326214955','20070220416329')
ORDER BY Sales.SalesOrderNumber;

Here is a sample rowset from the above query. I have added Product. prefix to the DimProduct related columns, because I want Product to be a nested collection.

ID                   Product.Name                                     Product.Price         Product.Quantity
-------------------- ------------------------------------------------ --------------------- ----------------
20070207721039       MGS Hand Games women M400 Yellow                8.99                  1
20070207721039       Adventure Works 26" 720p LCD HDTV M140 Silver   469.97                1
20070326214955       Adventure Works 20\" Analog CRT TV E45 Brown    200                   1
20070326214955       Contoso 4G MP3 Player E400 Silver               59.99                 1

Given the above sample rowset, here is an example of how I want my JSON document to be formatted:

[
  {
    "ID": "20070220416329",
    "Products": [
      {
        "ProductName": "Contoso Mini Battery Charger Kit E320 Silver",
        "Price": 24.99,
        "Quantity": 1
      },
      {
        "ProductName": "Adventure Works 26\" 720p LCD HDTV M140 Silver",
        "Price": 469.97,
        "Quantity": 1
      }
    ]
  },
  {
    "ID": "20070326214955",
    "Products": [
      {
        "ProductName": "Adventure Works 20\" Analog CRT TV E45 Brown",
        "Price": 200,
        "Quantity": 1
      },
      {
        "ProductName": "Contoso 4G MP3 Player E400 Silver",
        "Price": 59.99,
        "Quantity": 1
      }
    ]
  }
]

The problem is that each row is getting inserted a separate document, meaning (4) documents instead of (2), where Product is a nested document rather than a nested collection.

How do I accomplish what I'm trying to do?

Upvotes: 4

Views: 696

Answers (1)

Jay Gong
Jay Gong

Reputation: 23782

According to the sample on the official document , the nesting separator property is used to create hierarchical relationships (sub-documents) during import. However , generating array does not appear to be supported.

So , I suggest you querying the data from the SQL database and assemble it into the JSON data you want, then write to the JSON file and import the JSON file directly through the Data Migration tool.

query data python code:

import pyodbc
import os
from os.path import join as pjoin
import json

cnxn = pyodbc.connect('DRIVER={ODBC Driver 13 for SQL Server};SERVER=***.database.windows.net;DATABASE=***;UID=***;PWD=***')

cursor = cnxn.cursor()

cursor.execute("select * from dbo.test")
rowList = cursor.fetchall()

My sample json data:

[{
    "name": "jay",
    "courses": [{
        "course": "maths",
        "score": 100
    }, {
        "course": "history",
        "score": 80
    }]
}, {
    "name": "peter",
    "courses": [{
        "course": "maths",
        "score": 100
    }, {
        "course": "history",
        "score": 80
    }]
}]

assemble json data java code:

        boolean flag = true;
        List list= new ArrayList();
        String nameIndex = "";

       JSONObject obj = new JSONObject();
        while(rs.next()){
            String name = rs.getString("name");
            String course = rs.getString("course");
            int score = rs.getInt("score");
            if(!name.equals(nameIndex)){
                if(!flag){
                   list.add(obj);
                   obj = new JSONObject();
                   flag = false;
                }
                obj.put("name",name);
                List cources= new ArrayList();
                JSONObject objSub = new JSONObject();
                objSub.put("course",course);
                objSub.put("score",score);
                courses.add(objSub);
                obj.put("courses",courses);

            }else{

                List courses= (List)obj.get("courses");
                JSONObject objSub = new JSONObject();
                objSub.put("course",course);
                objSub.put("score",score);
                cources.add(objSub);
                obj.put("cources",cources);
            }
        }
        return list;

write data into json.file:

name_emb = {'your json string'}
output_dir = 'E:/'
listdir = os.listdir(output_dir)
if 'test.json' in listdir:
   fr = open(pjoin(output_dir, 'test.json'), 'a')
   model = json.dumps(name_emb)
   fr.write(model)
   fr.close()

Hope it helps you.

Upvotes: 1

Related Questions