NoSQL database design with one-to-many relationships and picking a partition key

Question

I am designing NoSQL documents to model the following tables:

TaxType
LocationType
Location
TaxAssign

These tables live in a SQL Server relational database, where we maintain tax assignments for items and scan codes. Below is the sample data for the TaxType and Location tables (Location is a self-referential table via Fk_LocationId):

Below is the table for TaxAssignments:

I am trying to convert the above SQL tables into a NoSQL document DB. There are one-to-many relationships between TaxType and TaxAssign, and between Location and TaxAssign. Most queries against the TaxAssign table filter on (FK_RetailItemCode or FK_TaxTypeCode), or on (ScanCode or FK_TaxTypeCode).

I want to design the JSON document, but I am finding it very hard to pick the partition key. ItemCode and ScanCode are queried a lot, but they are optional fields, so I cannot make them part of the partition key. Instead I picked UniqueIdentifier as the partition key to spread the data across many logical partitions.

Did I pick the right key? My queries never filter on UniqueIdentifier; they filter on ItemCode or ScanCode, with TaxType as an optional extra filter.

Below is my JSON document. Does the design need any changes, or should I take a different approach altogether?

    {
        "UniqueIdentifier": "1999-10-20-07.55.05.090087",
        "EffectiveDate": "1999-10-20",
        "TerminationDate": "9999-12-31",
        "LocationId": 1,
        "FK_RetailItemCode": 852874,
        "FK_TaxTypeCode": 1,
        "TaxType": [
            {
                "TaxTypeCode": 1,
                "TaxArea": "STATE ",
                "Description": "SALES TAX                                                   ",
                "IsPOSTaxEnabled": "Y",
                "POSTaxField": 1,
                "TaxOrder": 0,
                "IsCityLimit": " ",
                "IsTaxFree": false,
                "Location": [
                    {
                        "LocationId": 1,
                        "LocationType": "ST",
                        "City": "                            ",
                        "County": "                         ",
                        "Country": "USA                 ",
                        "CountyCode": 0,
                        "State": "ALABAMA             ",
                        "StateShortName": "AL",
                        "SortSequence": 40
                    }
                ]
            }
        ]
    },
    {
        "UniqueIdentifier": "2019-06-13-08.51.48.004124",
        "EffectiveDate": "2019-06-13",
        "TerminationDate": "2019-08-05",
        "LocationId": 13531,
        "FK_RetailItemCode": 852784,
        "FK_TaxTypeCode": 16,
        "TaxType": [
            {
                "TaxTypeCode": 16,
                "TaxArea": "CITY  ",
                "Description": "HOSPTLY TAX OUT CITY LIM                                    ",
                "IsPOSTaxEnabled": "Y",
                "POSTaxField": 2,
                "TaxOrder": 1,
                "IsCityLimit": "N",
                "IsTaxFree": false,
                "Location": [
                    {
                        "LocationId": 13531,
                        "LocationType": "CI",
                        "City": "FOLEY                       ",
                        "County": "BALDWIN                  ",
                        "Country": "USA                 ",
                        "CountyCode": 2,
                        "State": "ALABAMA             ",
                        "StateShortName": "AL",
                        "FK_LocationId": 13510
                    }
                ]
            }
        ]
    }

TaxAssignment is a huge table with 6 million rows, so I want to spread the data as widely as I can, which is why I picked UniqueIdentifier as the partition key. I could not use the columns that are queried most often, ItemCode and ScanCode, because they are optional (nullable).
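
To make the concern concrete, here is a minimal Python sketch of what this choice implies. It is not tied to any particular database: the `partition_for` hash, the `NUM_PARTITIONS` value, and the in-memory store are all assumptions made for illustration. It shows that a lookup by UniqueIdentifier touches one partition, while a filter on FK_RetailItemCode has to visit every partition.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical; a real store manages partitions itself


def partition_for(key: str) -> int:
    """Map a partition-key value to a partition using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def build_store(docs, key_field):
    """Group documents into partitions by the chosen partition-key field."""
    store = defaultdict(list)
    for doc in docs:
        store[partition_for(str(doc[key_field]))].append(doc)
    return store


def query_by_item(store, item_code):
    """Filter on a non-key field: every partition has to be scanned."""
    hits, scanned = [], 0
    for part in range(NUM_PARTITIONS):
        scanned += 1
        hits.extend(
            d for d in store.get(part, [])
            if d.get("FK_RetailItemCode") == item_code
        )
    return hits, scanned


def point_read(store, unique_identifier):
    """Filter on the partition key: only one partition is touched."""
    part = partition_for(unique_identifier)
    hits = [
        d for d in store.get(part, [])
        if d["UniqueIdentifier"] == unique_identifier
    ]
    return hits, 1


# Trimmed-down versions of the two sample documents above.
docs = [
    {"UniqueIdentifier": "1999-10-20-07.55.05.090087",
     "FK_RetailItemCode": 852874, "FK_TaxTypeCode": 1},
    {"UniqueIdentifier": "2019-06-13-08.51.48.004124",
     "FK_RetailItemCode": 852784, "FK_TaxTypeCode": 16},
]
store = build_store(docs, "UniqueIdentifier")
item_hits, item_scanned = query_by_item(store, 852874)
id_hits, id_scanned = point_read(store, "1999-10-20-07.55.05.090087")
```

In this toy model the item query always scans `NUM_PARTITIONS` partitions regardless of how many documents match, which is the cost I am worried about at 6 million rows.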

Questions:

As I have one-to-many relationships, can I embed Location and TaxType inside each TaxAssignment?
Is it OK to pick UniqueIdentifier as the partition key even though it is never used to query the collection?
Should I denormalize the whole JSON with TaxType and Location instead of embedding them inside each tax assignment?
Any change to TaxType or Location metadata might have to be applied in a lot of places. What design approaches can I use here?

TaxType --> number of records is 19

Location --> number of records is 38000

TaxAssign --> number of records is 6 million.
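
For the last question, one alternative to embedding can be sketched as follows, assuming the TaxType and Location sets are small enough to keep as separate reference data: each TaxAssign document stores only the codes, and they are resolved at read time. This is an in-memory Python illustration only; the `resolve` helper and the edited description are hypothetical, and the sample values are taken from the documents above with padding trimmed.

```python
# Reference data keyed by code (values from the sample documents, padding trimmed).
tax_types = {
    1: {"TaxTypeCode": 1, "TaxArea": "STATE", "Description": "SALES TAX"},
    16: {"TaxTypeCode": 16, "TaxArea": "CITY",
         "Description": "HOSPTLY TAX OUT CITY LIM"},
}
locations = {
    1: {"LocationId": 1, "LocationType": "ST",
        "State": "ALABAMA", "StateShortName": "AL"},
    13531: {"LocationId": 13531, "LocationType": "CI", "City": "FOLEY",
            "State": "ALABAMA", "StateShortName": "AL", "FK_LocationId": 13510},
}

# A TaxAssign document that references TaxType and Location by code only.
assignment = {
    "UniqueIdentifier": "1999-10-20-07.55.05.090087",
    "EffectiveDate": "1999-10-20",
    "TerminationDate": "9999-12-31",
    "LocationId": 1,
    "FK_RetailItemCode": 852874,
    "FK_TaxTypeCode": 1,
}


def resolve(doc, tax_types, locations):
    """Return a copy of the assignment with TaxType and Location attached."""
    resolved = dict(doc)
    resolved["TaxType"] = dict(tax_types[doc["FK_TaxTypeCode"]])
    resolved["Location"] = dict(locations[doc["LocationId"]])
    return resolved


resolved_before = resolve(assignment, tax_types, locations)

# Hypothetical metadata change: it is made once, in the reference data only.
tax_types[1]["Description"] = "STATE SALES TAX"

resolved_after = resolve(assignment, tax_types, locations)
```

The trade-off in this sketch is an extra lookup on every read in exchange for updating TaxType or Location in a single place; with embedding, the same change would have to be written to every affected TaxAssign document.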
