peanut

Reputation: 109

Architecture for Microsoft Azure: CSV to SQL

I'm an intern in charge of researching an Azure project.
At the moment I'm devising an architecture for part of the project.

The goal is to convert multiple CSV files into an SQL database in the cloud. These CSVs will be sent from random locations around the country and need to be processed so the database can eventually be accessed through a web service.

I'm totally new to the Azure scene and have been schooling myself, but it's all a bit fuzzy in my head.

Some info:

The CSVs are small files, but about 20,000 would be received daily. Yes, it needs to be SQL storage, because we need to aggregate the data easily.

What will be in the CSV and needs to be stored?
a unique key value (string)
a consumption value (double)
a datetime stamp (datetime/string)
a quality value (int)

The architecture I had in mind would be:
HTTP requests to the cloud (does the cloud need a listener service?)
A queue service that holds the CSVs before they are processed
The SQL storage (direct import, or do I need some kind of worker role in between?)
A web service that will get requests from an external AOS or a client application with a query for the data in the SQL DB.

Am I correct in assuming this problem can be solved with standard components, or do I need to implement a VM role? How would you set this up?

Any input would be much appreciated, because I really feel lost in the clouds :)
I hope I gave a clear overview of the requirements...
It's not easy explaining something you don't fully grasp yourself.

Upvotes: 3

Views: 1304

Answers (2)

Voclare

Reputation: 251

Is there a reason why you can't just use BCP (Bulk Copy) to import the data directly into SQL Azure? BCP supports CSV files, and I suspect you could create a pretty simple process to import the data on a daily basis using this tool. If you do this, make sure you read up on some of the ways you can optimize the load of the data. This can really make a difference if you have large data sets.
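For example (just a sketch; the server, database, table, and credential names below are placeholders, not anything from your setup), a daily load could boil down to a single command, where -c loads character data and -t, sets the comma as the field terminator:

    bcp MyDatabase.dbo.Consumption in daily_batch.csv -c -t, -S myserver.database.windows.net -U myuser@myserver -P mypassword

One caveat: bcp doesn't understand quoted CSV fields, and with around 20,000 small files a day you'd probably want to concatenate them into larger batch files before loading.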

Upvotes: 0

David Makogon

Reputation: 71118

You don't need a VM role at all. Here's a strawman idea:

  • Set up a web service that lets you push your CSV files up (easy to do in a Web Role with an svc). Have that service method store each CSV in an Azure Blob in some specific container (like "uploads"), with a name like "guid.csv" - just call Guid.NewGuid().ToString() to generate a guid on the fly. Once that's done, create a queue message referencing that file name.
  • In the Run() method of either the same role instance hosting your svc (just override Run()) or in a separate worker role, set up a while(true) loop that reads from the queue to grab a CSV needing import, reads the blob into a memory stream (or dumps it to a temp file on disk), and then calls a local helper method to parse the CSV and do the SQL inserts. See the sketch after this list.
  • Set up another web service for retrieving data. Again, this can be hosted either in the same web role or in another.
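A very rough sketch of those first two bullets, assuming the Azure storage client library (Microsoft.WindowsAzure.Storage); the container name, queue name, table name, and column names are placeholders I made up, so adjust to taste and add real error handling, retries, and a proper CSV parser:

    // Rough sketch only - "uploads", "csv-imports", and the Consumption table
    // are made-up placeholder names; error handling and retries are omitted.
    using System;
    using System.Data.SqlClient;
    using System.IO;
    using System.Threading;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Blob;
    using Microsoft.WindowsAzure.Storage.Queue;

    public class CsvImportSketch
    {
        private readonly CloudBlobContainer container;
        private readonly CloudQueue queue;
        private readonly string sqlConnectionString;

        public CsvImportSketch(string storageConnectionString, string sqlConnectionString)
        {
            var account = CloudStorageAccount.Parse(storageConnectionString);
            container = account.CreateCloudBlobClient().GetContainerReference("uploads");
            container.CreateIfNotExists();
            queue = account.CreateCloudQueueClient().GetQueueReference("csv-imports");
            queue.CreateIfNotExists();
            this.sqlConnectionString = sqlConnectionString;
        }

        // Called by the upload service method: store the CSV as guid.csv and queue its name.
        public void StoreUpload(Stream csvStream)
        {
            string name = Guid.NewGuid().ToString() + ".csv";
            container.GetBlockBlobReference(name).UploadFromStream(csvStream);
            queue.AddMessage(new CloudQueueMessage(name));
        }

        // Called from the role's Run(): dequeue a file name, pull the blob, insert its rows.
        public void ProcessLoop()
        {
            while (true)
            {
                CloudQueueMessage msg = queue.GetMessage();
                if (msg == null) { Thread.Sleep(TimeSpan.FromSeconds(5)); continue; }

                var blob = container.GetBlockBlobReference(msg.AsString);
                using (var ms = new MemoryStream())
                {
                    blob.DownloadToStream(ms);
                    ms.Position = 0;
                    ImportCsv(new StreamReader(ms));
                }
                queue.DeleteMessage(msg);
            }
        }

        // Naive parsing: fine for simple 4-column rows, use a real CSV parser for quoted fields.
        private void ImportCsv(TextReader reader)
        {
            using (var conn = new SqlConnection(sqlConnectionString))
            {
                conn.Open();
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    string[] f = line.Split(',');
                    if (f.Length < 4) continue;
                    var cmd = new SqlCommand(
                        "INSERT INTO Consumption (KeyValue, Consumption, Stamp, Quality) " +
                        "VALUES (@k, @c, @s, @q)", conn);
                    cmd.Parameters.AddWithValue("@k", f[0]);
                    cmd.Parameters.AddWithValue("@c", double.Parse(f[1]));
                    cmd.Parameters.AddWithValue("@s", DateTime.Parse(f[2]));
                    cmd.Parameters.AddWithValue("@q", int.Parse(f[3]));
                    cmd.ExecuteNonQuery();
                }
            }
        }
    }

Your role's Run() override would just construct one of these and call ProcessLoop(), and the svc upload method would call StoreUpload() with the posted stream.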

Absolutely no need for a VM Role.

Upvotes: 3
