Reputation: 4666
In one of my projects I receive customer order details in the middle of each month as a file of roughly 14 billion lines. I need to upload them into my system (1 line per record) within 1 week, after which users can query the data.
I decided to use Azure Table Storage based on price and performance considerations. But I found that the scalability targets for Table Storage are "2,000 entities per second per partition" and "20,000 entities per second per account": https://azure.microsoft.com/en-us/documentation/articles/storage-scalability-targets/
This means that if I use a single storage account I would need about a month to upload them, which is not acceptable.
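The estimate can be sanity-checked with a quick calculation. The 2,000/s and 20,000/s figures are the published scalability targets, so the two bounds below are a best case and a worst case; real sustained throughput typically falls somewhere in between:

```python
SECONDS_PER_DAY = 86_400

def upload_days(rows: int, rate_per_sec: int) -> float:
    """Days needed to upload `rows` entities at a sustained rate."""
    return rows / rate_per_sec / SECONDS_PER_DAY

rows = 14_000_000_000

# Best case: one account driven at its full 20,000 entities/s target.
print(round(upload_days(rows, 20_000), 1))  # ~8.1 days

# Worst case: all traffic funnelled through one partition at 2,000/s.
print(round(upload_days(rows, 2_000), 1))   # ~81.0 days
```

So even in the theoretical best case a single account leaves almost no headroom against the 1-week deadline, and hitting hot partitions makes it far worse.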
Is there any solution I can speed up to finish the upload task within 1 week?
Upvotes: 4
Views: 2489
Reputation: 12228
The simple answer to this is to use multiple storage accounts. If you partition the data and stripe it across multiple storage accounts, you can get as much throughput as you need. You just need another layer to aggregate the data afterwards.
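A minimal sketch of the striping idea, assuming a hypothetical pool of 10 accounts: hash each partition key to pick an account deterministically, then chunk each partition's rows into groups of at most 100 (the Table Storage entity-group-transaction limit, which also requires all entities in a batch to share a partition key). Each chunk would then be submitted as one batch transaction against the chosen account's table client; the names below are illustrative, not from any SDK.

```python
import hashlib

NUM_ACCOUNTS = 10   # hypothetical size of the storage-account pool
BATCH_LIMIT = 100   # Table Storage batch limit: 100 entities, one partition key

def account_for(partition_key: str) -> int:
    """Deterministically map a partition key to an account index.

    Hash-based striping keeps each partition on one account while
    spreading partitions evenly across the pool.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_ACCOUNTS

def batches(entities: list) -> list:
    """Split one partition's entities into batch-sized chunks."""
    return [entities[i:i + BATCH_LIMIT]
            for i in range(0, len(entities), BATCH_LIMIT)]
```

Because the mapping is deterministic, the query layer can recompute `account_for(key)` later to find which account holds any given partition, without a lookup table.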
You could potentially have a slower process that is creating one large master table in the background.
You may have found this already, but there is an excellent article about importing large datasets into Azure Tables.
Upvotes: 2