Reputation: 301
I have a scenario where multiple users will be running import processes, but each of them will be working for a different client.
I have one core table which gets the most hits whenever the import processes run. I have 2 options now.
Can anyone suggest which one is better and why?
I am giving my requirements in more detail this time. Can you all have another look and provide your comments after going through the requirements?
The query is regarding data modeling for the core functionality of my application.
I have a scenario where multiple users will be running import processes, but each of them will be working for a different client. Also, at the same time, a client's data could be shown to a user and modified or inserted too, while an import process for the same or a different client is running.
I have 2 core tables which get the most hits whenever import processes run.
I have 2 options now:
1. Have 2 shared core tables and run the imports sequentially by queuing the import processes.
2. Have a separate set of these tables for each client and run the imports in parallel.
Table 1: ID, ClientID, SourceID, Count, AnotherCol1, AnotherCol2, AnotherCol3
Table 2: ID, ClientID, OrderID, Count, AnotherCol4, AnotherCol5, AnotherCol6
More information about the import process:
1. These tables are not going to be used in any reporting.
2. Each import process will insert 20k-30k records (7 columns) into each of these tables, and there will be around 40-50 such imports in a day.
3. While an import process is going on, data could be retrieved from these tables by another user and inserted or updated too.
4. These are going to be among the most heavily used tables in the application.
5. BULK INSERT will be used for insertion.
6. The clustered index is on the primary key, which is an identity column.
7. We are considering table partitioning too.
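For reference, here is a minimal sketch of what Table 1 could look like under option 1, together with the kind of BULK INSERT used for the loads. The data types, file path, and constraint name are assumptions for illustration only, not the actual schema.

    -- Sketch only: assumed data types, single shared table for all clients (option 1).
    CREATE TABLE dbo.Table1
    (
        ID          INT IDENTITY(1,1) NOT NULL,
        ClientID    INT          NOT NULL,
        SourceID    INT          NOT NULL,
        [Count]     INT          NOT NULL,
        AnotherCol1 VARCHAR(100) NULL,
        AnotherCol2 VARCHAR(100) NULL,
        AnotherCol3 VARCHAR(100) NULL,
        CONSTRAINT PK_Table1 PRIMARY KEY CLUSTERED (ID)
    );

    -- Bulk load from a client-specific file (hypothetical path); a format file
    -- is typically needed so the data file can skip the identity column.
    BULK INSERT dbo.Table1
    FROM 'C:\imports\client_42.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');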
Can you please suggest which option is better and why?
Also, if you suggest going with option 2, would it not be a performance hit to create so many tables in the database? Should we create a separate database for these 1000 tables in this case?
Upvotes: 3
Views: 1235
Reputation: 89671
Another option is a third scenario where you have a single table, but you can still do the imports in parallel by having a batch identifier in the table, which stops the imports from stepping on each other.
The main problem with having multiple people in the same table is that you cannot do things like TRUNCATE.
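A rough sketch of that approach, assuming a BatchID column is added to the core table (the column name and the staging source below are illustrative, not part of the original schema):

    -- Illustrative only: each import tags its rows with its own batch ID,
    -- so parallel loads can share the table and be backed out individually.
    ALTER TABLE dbo.Table1 ADD BatchID UNIQUEIDENTIFIER NULL;

    DECLARE @BatchID UNIQUEIDENTIFIER = NEWID();

    INSERT INTO dbo.Table1
        (ClientID, SourceID, [Count], AnotherCol1, AnotherCol2, AnotherCol3, BatchID)
    SELECT ClientID, SourceID, [Count], AnotherCol1, AnotherCol2, AnotherCol3, @BatchID
    FROM dbo.Table1_Staging;   -- hypothetical staging source for one import

    -- Cleaning up or rolling back one import touches only its own batch,
    -- instead of a TRUNCATE that would hit everyone else's rows too.
    DELETE FROM dbo.Table1 WHERE BatchID = @BatchID;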
For me, the decision would be related to where the data eventually goes. Is this just a staging table for convenience, because some SQL for transforms or lookups is going to be run against it after the load? Would it be possible to create such tables in a separate database or schema, with unique names, which would make them easy to clean up without interfering with or bloating the transaction log in your primary database? Do you need to insert in bulk, then apply indexes and eventually drop the table? Is such a table even necessary if you are using SSIS to load the data? You can often do a lot of work in the pipeline without needing a staging table.
All these things would play into my decision making process on the architecture.
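As a sketch of the separate-schema idea, something along these lines could work; the StagingDB database, stg schema, and naming convention are assumptions for illustration:

    -- Illustrative only: create a uniquely named staging table in a separate
    -- database/schema (which can use SIMPLE recovery), load into it, and drop
    -- it after the transform without bloating the primary database's log.
    DECLARE @TableName SYSNAME = 'Import_' + CONVERT(VARCHAR(36), NEWID());

    DECLARE @Sql NVARCHAR(MAX) = N'
        CREATE TABLE StagingDB.stg.' + QUOTENAME(@TableName) + N'
        (
            ClientID    INT,
            SourceID    INT,
            [Count]     INT,
            AnotherCol1 VARCHAR(100),
            AnotherCol2 VARCHAR(100),
            AnotherCol3 VARCHAR(100)
        );';

    EXEC sys.sp_executesql @Sql;

    -- BULK INSERT into the staging table, run the transform/lookup SQL into
    -- the core tables, then DROP the staging table when the import finishes.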
Upvotes: 2
Reputation: 1611
It's not really a question with a definitive answer, as each approach has its own benefits and drawbacks.
Scenario 1: Central core table
Scenario 2: 300 core tables
In the end, the answer is whatever really works for you.
Upvotes: 3