Reputation: 115
I am attempting to use the Cosmos DB Data Migration tool to import from SQL Server 2014 to DocumentDB. The following is a sample SELECT statement:
SELECT
Sales.SalesOrderNumber AS [ID]
, Product.ProductName AS [Product.Name]
, Product.UnitPrice AS [Product.Price]
, Sales.SalesQuantity AS [Product.Quantity]
FROM ContosoRetailDW.dbo.FactOnlineSales AS Sales
JOIN ContosoRetailDW.dbo.DimProduct AS Product ON Product.ProductKey = Sales.ProductKey
WHERE Sales.SalesOrderNumber IN ('20070326214955','20070220416329')
ORDER BY Sales.SalesOrderNumber;
Here is a sample rowset from the above query. I have added Product. prefix to the DimProduct related columns, because I want Product to be a nested collection.
ID Product.Name Product.Price Product.Quantity
-------------------- ------------------------------------------------ --------------------- ----------------
20070207721039 MGS Hand Games women M400 Yellow 8.99 1
20070207721039 Adventure Works 26" 720p LCD HDTV M140 Silver 469.97 1
20070326214955 Adventure Works 20\" Analog CRT TV E45 Brown 200 1
20070326214955 Contoso 4G MP3 Player E400 Silver 59.99 1
Given the above sample rowset, here is an example of how I want my JSON document to be formatted:
[
{
"ID": "20070220416329",
"Products": [
{
"ProductName": "Contoso Mini Battery Charger Kit E320 Silver",
"Price": 24.99,
"Quantity": 1
},
{
"ProductName": "Adventure Works 26\" 720p LCD HDTV M140 Silver",
"Price": 469.97,
"Quantity": 1
}
]
},
{
"ID": "20070326214955",
"Products": [
{
"ProductName": "Adventure Works 20\" Analog CRT TV E45 Brown",
"Price": 200,
"Quantity": 1
},
{
"ProductName": "Contoso 4G MP3 Player E400 Silver",
"Price": 59.99,
"Quantity": 1
}
]
}
]
The problem is that each row is getting inserted a separate document, meaning (4) documents instead of (2), where Product is a nested document rather than a nested collection.
How do I accomplish what I'm trying to do?
Upvotes: 4
Views: 696
Reputation: 23782
According to the sample on the official document , the nesting separator property is used to create hierarchical relationships (sub-documents) during import. However , generating array does not appear to be supported.
So , I suggest you querying the data from the SQL database and assemble it into the JSON data you want, then write to the JSON file and import the JSON file directly through the Data Migration tool.
query data python code:
import pyodbc
import os
from os.path import join as pjoin
import json
cnxn = pyodbc.connect('DRIVER={ODBC Driver 13 for SQL Server};SERVER=***.database.windows.net;DATABASE=***;UID=***;PWD=***')
cursor = cnxn.cursor()
cursor.execute("select * from dbo.test")
rowList = cursor.fetchall()
My sample json data:
[{
"name": "jay",
"courses": [{
"course": "maths",
"score": 100
}, {
"course": "history",
"score": 80
}]
}, {
"name": "peter",
"courses": [{
"course": "maths",
"score": 100
}, {
"course": "history",
"score": 80
}]
}]
assemble json data java code:
boolean flag = true;
List list= new ArrayList();
String nameIndex = "";
JSONObject obj = new JSONObject();
while(rs.next()){
String name = rs.getString("name");
String course = rs.getString("course");
int score = rs.getInt("score");
if(!name.equals(nameIndex)){
if(!flag){
list.add(obj);
obj = new JSONObject();
flag = false;
}
obj.put("name",name);
List cources= new ArrayList();
JSONObject objSub = new JSONObject();
objSub.put("course",course);
objSub.put("score",score);
courses.add(objSub);
obj.put("courses",courses);
}else{
List courses= (List)obj.get("courses");
JSONObject objSub = new JSONObject();
objSub.put("course",course);
objSub.put("score",score);
cources.add(objSub);
obj.put("cources",cources);
}
}
return list;
write data into json.file:
name_emb = {'your json string'}
output_dir = 'E:/'
listdir = os.listdir(output_dir)
if 'test.json' in listdir:
fr = open(pjoin(output_dir, 'test.json'), 'a')
model = json.dumps(name_emb)
fr.write(model)
fr.close()
Hope it helps you.
Upvotes: 1