Hezi Resheff

Python support for Azure ML -- speed issue

We are trying to create an Azure ML web service that receives a (.csv) data file, does some processing, and returns two similar files. The Python support recently added to the Azure ML platform was very helpful: we were able to port our code, run it in experiment mode, and publish the web service.
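For reference, the processing code is wrapped in the entry-point function that the Execute Python Script module expects. A minimal sketch of that shape (the actual processing logic is omitted; everything here is illustrative):

    # Azure ML Studio calls this function with the connected datasets as pandas DataFrames.
    def azureml_main(dataframe1=None, dataframe2=None):
        # ... the actual processing goes here; this sketch just passes the data through ...
        processed = dataframe1.copy()
        # The return value must be a sequence of DataFrames, one per output port.
        return processed,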

Using the "batch processing" API, we are now able to direct a file from blob storage to the service and get the desired output. However, run time for small files (a few KB) is significantly slower than on a local machine, and more importantly, the process seems to never return for slightly larger input files (40MB). Processing the same file on my local machine takes under a minute.
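For context, we submit the batch job roughly as in the sample code generated for the service. A simplified sketch of the submit-and-start pattern (the endpoint URL, API key, and blob locations are all placeholders):

    import requests

    # Placeholders -- real values come from the web service dashboard / API help page.
    JOBS_URL = "https://<region>.services.azureml.net/workspaces/<workspace-id>/services/<service-id>/jobs"
    API_KEY = "<api-key>"
    STORAGE = "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>"
    headers = {"Authorization": "Bearer " + API_KEY, "Content-Type": "application/json"}

    body = {
        "Input": {"ConnectionString": STORAGE, "RelativeLocation": "mycontainer/input.csv"},
        "Outputs": {
            "output1": {"ConnectionString": STORAGE, "RelativeLocation": "mycontainer/out1.csv"},
            "output2": {"ConnectionString": STORAGE, "RelativeLocation": "mycontainer/out2.csv"},
        },
        "GlobalParameters": {},
    }

    # Submit the job, start it, and then poll the job URL until it reports a terminal
    # status (Finished / Failed / Cancelled); it is this wait that never completes
    # for the 40MB input.
    job_id = requests.post(JOBS_URL + "?api-version=2.0", json=body, headers=headers).json()
    requests.post(JOBS_URL + "/" + job_id + "/start?api-version=2.0", headers=headers)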

My question is whether you can see anything we are doing wrong, or whether there is a way to speed this up. Here is the DAG representation of the experiment:

[Image: DAG representation of the experiment]

Is this the way the experiment should be set up?


Answers (1)

Hezi Resheff

It looks like the problem was with the processing of a timestamp column in the input table. The successful workaround was to explicitly force that column to be processed as string values, using the "Metadata Editor" block. The final model now looks like this:

[Image: final model]
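For anyone hitting the same issue from the Execute Python Script side: the Metadata Editor does the conversion declaratively, but the same effect can presumably be obtained in code by casting the column to strings before any further processing (the column name below is a placeholder):

    def azureml_main(dataframe1=None, dataframe2=None):
        # Force the timestamp column to plain strings so downstream steps don't
        # try to parse every value as a datetime.
        dataframe1["timestamp"] = dataframe1["timestamp"].astype(str)
        return dataframe1,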
