Reputation: 66
I have recently set up an Azure Machine Learning experiment to retrain, update, and execute on a daily basis using Azure Data Factory, following the example documents.
My pipeline is set up similar to the one below:
{
    "name": "RetrainAndExecutePipeline",
    "properties": {
        "activities": [
            {
                "type": "AzureMLBatchExecution",
                "typeProperties": {
                    "webServiceOutputs": {
                        "Output-TrainedModel": "TrainedModel"
                    },
                    "webServiceInputs": {},
                    "globalParameters": {}
                },
                "outputs": [
                    {
                        "name": "TrainedModel"
                    }
                ],
                "policy": {
                    "timeout": "01:00:00",
                    "concurrency": 1,
                    "executionPriorityOrder": "NewestFirst",
                    "retry": 3
                },
                "scheduler": {
                    "frequency": "Day",
                    "interval": 1,
                    "offset": "22:00:00",
                    "style": "StartOfInterval"
                },
                "name": "Retrain ML Model",
                "linkedServiceName": "TrainingService"
            }
        ],
        "start": "2017-08-20T22:00:00Z",
        "end": "9999-09-09T00:00:00Z",
        "isPaused": false,
        "hubName": "autdatafactoryml_hub",
        "pipelineMode": "Scheduled"
    }
}
and the TrainedModel dataset is defined as below:
{
    "name": "TrainedModel",
    "properties": {
        "published": false,
        "type": "AzureBlob",
        "linkedServiceName": "AzureStorageLinkedService",
        "typeProperties": {
            "fileName": "trainedModel.ilearner",
            "folderPath": "trainingoutput",
            "format": {
                "type": "TextFormat"
            }
        },
        "availability": {
            "frequency": "Day",
            "interval": 1,
            "offset": "22:00:00",
            "style": "StartOfInterval"
        }
    }
}
I have noticed that after training completes, the outputs I get in Azure Blob Storage from the web service output connected to the "Train Model" node are the .ilearner file and two randomly named files with no extensions, even though I haven't specified them. One is an XML-formatted file with the contents:
<?xml version="1.0" encoding="utf-8"?>
<RuntimeInfo>
<Language>DotNet</Language>
<Version>4.5.0</Version>
</RuntimeInfo>
and the other contains the information you can see when you visualize the output within the Azure ML experiment, formatted as JSON as below:
{
    "visualizationType": "learner",
    "learner": {
        "name": "LogisticRegressionClassifier",
        "isTrained": true,
        "settings": {
            "records": [
                ...
            ],
            "features": [
                {
                    "name": "Setting",
                    "index": 0,
                    "elementType": "System.String",
                    "featureType": "String Feature"
                },
                {
                    "name": "Value",
                    "index": 1,
                    "elementType": "System.String",
                    "featureType": "String Feature"
                }
            ],
            "name": null,
            "numberOfRows": 8,
            "numberOfColumns": 2
        },
        "weights": {
            "records": [
                ...
            ],
            "features": [
                {
                    "name": "Feature",
                    "index": 0,
                    "elementType": "System.String",
                    "featureType": "String Feature"
                },
                {
                    "name": "Weight",
                    "index": 1,
                    "elementType": "System.Double",
                    "featureType": "Numeric Feature"
                }
            ],
            "name": null,
            "numberOfRows": 92,
            "numberOfColumns": 2
        }
    }
}
This JSON file is the one I am interested in, as I presume it contains the coefficient values. I want to track how individual coefficient values change as I update the trained model, but I have not been able to find a way to capture this output.
My question is: is there a way to capture multiple outputs from a single web service output in an Azure ML experiment using Azure Data Factory? Or is there a completely different way for me to resolve this?
I appreciate everyone's feedback, and thank you in advance.
Upvotes: 0
Views: 708
Reputation: 66
In Azure ML Studio, you can create a web service with multiple outputs by attaching multiple Web Service Output modules to your experiment. The outputs from these modules are returned in JSON format when the web service is called. Alternatively, you can use multiple Export Data modules to write multiple results directly to Azure storage, for example.
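On the Data Factory side, each additional Web Service Output module can then be mapped to its own blob dataset in the activity's `webServiceOutputs` section. A hedged sketch of the relevant fragment, assuming the second output port in the experiment is named `Output-Visualization` and a corresponding `VisualizationOutput` dataset has been defined (both names are illustrative):

```json
"typeProperties": {
    "webServiceOutputs": {
        "Output-TrainedModel": "TrainedModel",
        "Output-Visualization": "VisualizationOutput"
    }
},
"outputs": [
    { "name": "TrainedModel" },
    { "name": "VisualizationOutput" }
]
```

The keys must match the Web Service Output port names in the experiment, and each value must name a dataset the activity lists under `outputs`.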
Upvotes: 1