Reputation: 824
Sorry for the long post; I need to explain it properly for people to understand.
I have a pipeline in Data Factory that triggers a published AML endpoint:
I am trying to parametrize this ADF pipeline so that I can deploy it to test and prod, but on test and prod the AML endpoints are different.
Therefore, I have tried to edit the parameter configuration in ADF as shown here:
Here, in the section Microsoft.DataFactory/factories/pipelines, I add "*": "=" so that all the pipeline parameters are parametrized:
"Microsoft.DataFactory/factories/pipelines": {
    "*": "="
}
After this I export the template to see which parameters are in the JSON. There are a lot of them, but I do not see any parameter that has the AML endpoint name as its value; I only see that the endpoint ID is parametrized.
My question is: is it possible to parametrize the AML endpoint by name, so that when deploying ADF to test I can just provide the AML endpoint name and it picks up the ID automatically?
Upvotes: 1
Views: 725
Reputation: 34
Making changes to ADF (ARMTemplateForFactory.json) or Synapse (TemplateForWorkspace.json) inside a DevOps CI/CD pipeline
Sometimes parameters are not automatically added to the parameter file (ARMTemplateParametersForFactory.json / TemplateParametersForWorkspace.json), for example MLPipelineEndpointId. In the case of an ML pipeline you can use PipelineId as a parameter, but it can change every time the ML pipeline is updated.
You can solve this issue by replacing the value in the ADF (ARMTemplateForFactory.json) or Synapse (TemplateForWorkspace.json) template using Azure PowerShell. The idea is simple: you use PowerShell to open the ARM template and replace the value based on the environment, and it works exactly like overriding parameters within DevOps.
This editing is done on the fly, i.e. the DevOps artifact is updated, not the repository file; the ADF/Synapse repository won't change, just like when overriding parameters.
Issue: We currently have two environments for Synapse, called bla-bla-dev and bla-bla-test. The dev Synapse environment uses the dev Machine Learning environment, and the test Synapse environment uses the test ML environment. But MLPipelineEndpointId is grayed out in dev Synapse, and the parameter is not present in the parameter file, so it can't be overridden normally.
Solution: Use Azure PowerShell to run the command below:
(Get-Content $(System.DefaultWorkingDirectory)/Artifacts_source/bla-bla-dev/TemplateForWorkspace.json).Replace($(scoringMLPipelineEndPointDev), $(scoringMLPipelineEndPoint)) | Set-Content $(System.DefaultWorkingDirectory)/Artifacts_source/bla-bla-dev/TemplateForWorkspace.json
Steps
Add an Azure PowerShell step to the ADF/Synapse release DevOps pipeline, running the command above. This step has to be placed before the ARM template deployment step.
Once deployed, you will see that your test environment points to the test MLPipelineEndpointId.
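If you prefer not to use PowerShell, the same on-the-fly substitution can be sketched in Python. This is only an illustration of the idea; the function name and the IDs passed in are assumptions, not part of the original answer:

```python
from pathlib import Path


def swap_endpoint_id(template_path: str, dev_endpoint_id: str, target_endpoint_id: str) -> None:
    """Replace the dev ML pipeline endpoint ID in the exported ARM template
    with the ID for the target environment, editing the artifact in place."""
    path = Path(template_path)
    text = path.read_text()
    path.write_text(text.replace(dev_endpoint_id, target_endpoint_id))
```

As with the PowerShell variant, you would run this against the downloaded build artifact before the ARM template deployment step, so the repository file is never touched.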
Upvotes: 0
Reputation: 181
I finally fixed this.
The trick is not to choose Pipeline Endpoint ID but to choose Pipeline ID.
Pipeline ID can be parametrized, and I have set it up to come from a global parameter, so I do not need to find the right level of indentation every time.
Then:
Later you add the global parameters to your ARM template:
And in the parameter template you add:
"Microsoft.DataFactory/factories": {
    "properties": {
        "globalParameters": {
            "*": {
                "value": "="
            }
        },
        "globalConfigurations": {
            "*": "="
        },
        "encryption": {
            "*": "=",
            "identity": {
                "*": "="
            }
        }
    }
},
"Microsoft.DataFactory/factories/globalparameters": {
    "properties": {
        "*": {
            "value": "="
        }
    }
}
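For context, inside the pipeline the ML Execute Pipeline activity can then read the ID from the global parameter with an expression along these lines (the parameter name mlPipelineId is illustrative, not from the original answer):

```
@pipeline().globalParameters.mlPipelineId
```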
Finally, I wrote a Python CLI tool to get the latest pipeline ID for a given published pipeline name:
import argparse

from azureml.pipeline.core import PipelineEndpoint
from azureml.core import Workspace

from env_variables import Env
from manage_workspace import get_workspace


def get_latest_published_endpoint(ws: Workspace, pipeline_name: str) -> str:
    """
    Get the ID of the latest published pipeline behind a pipeline endpoint,
    given a machine learning pipeline name.
    The function is used to update the pipeline ID in the ADF deploy pipeline.

    Parameters
    ----------
    ws : azureml.core.Workspace
        A workspace object to use to search for the pipelines
    pipeline_name : str
        The name of the pipeline endpoint to retrieve the latest version for

    Returns
    -------
    endpoint_id : str
        The ID of the latest published pipeline behind the endpoint
    """
    pipeline_endpoint = PipelineEndpoint.get(workspace=ws, name=pipeline_name)
    endpoint_id = pipeline_endpoint.get_pipeline().id  # this gives back the pipeline id
    # pipeline_endpoint.id gives back the pipeline *endpoint* id, which cannot be set
    # as a dynamic parameter in ADF in an easy way
    return endpoint_id


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--monitoring_pipeline_name", type=str,
                        help="Pipeline name to get the pipeline ID for",
                        default='yourmonitoringpipeline')
    parser.add_argument("--training_pipeline_name", type=str,
                        help="Pipeline name to get the pipeline ID for",
                        default='yourtrainingpipeline')
    parser.add_argument("--scoring_pipeline_name", type=str,
                        help="Pipeline name to get the pipeline ID for",
                        default='yourscoringpipeline')
    args, _ = parser.parse_known_args()

    e = Env()
    ws = get_workspace(e.workspace_name, e.subscription_id, e.resource_group)  # type: ignore

    latest_monitoring_endpoint = get_latest_published_endpoint(ws, pipeline_name=args.monitoring_pipeline_name)
    latest_training_endpoint = get_latest_published_endpoint(ws, pipeline_name=args.training_pipeline_name)
    latest_scoring_endpoint = get_latest_published_endpoint(ws, pipeline_name=args.scoring_pipeline_name)

    print('##vso[task.setvariable variable=MONITORING_PIPELINE_ID;]%s' % latest_monitoring_endpoint)
    print('##vso[task.setvariable variable=TRAINING_PIPELINE_ID;]%s' % latest_training_endpoint)
    print('##vso[task.setvariable variable=SCORING_PIPELINE_ID;]%s' % latest_scoring_endpoint)
By printing the variables in this way, they are added to the environment variables that I can later pick up in the ARM deploy step:
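For context, a variable set this way can be consumed via overrideParameters in the ARM deployment task. This YAML fragment is only a sketch; the file paths and the generated parameter name (default_properties_trainingPipelineId_value) are assumptions that depend on your factory and global parameter names:

```yaml
- task: AzureResourceManagerTemplateDeployment@3
  inputs:
    csmFile: '$(System.DefaultWorkingDirectory)/ArmTemplate/ARMTemplateForFactory.json'
    csmParametersFile: '$(System.DefaultWorkingDirectory)/ArmTemplate/ARMTemplateParametersForFactory.json'
    overrideParameters: '-default_properties_trainingPipelineId_value "$(TRAINING_PIPELINE_ID)"'
```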
And then we have our desired setup:
Different pipeline IDs for different environments.
Maybe material for a blog post, as it works like a charm.
Upvotes: 1
Reputation: 1
I faced a similar issue when deploying ADF pipelines with ML between environments. Unfortunately, as of now, the ADF parameter file does not have the ML pipeline name as a parameter value. The only workaround is modifying the parameter (JSON) file so that it aligns with your pipeline design. For example, I am triggering an ML pipeline endpoint inside ForEach activity --> If Condition --> ML pipeline.
Here is my parameter file values:
"Microsoft.DataFactory/factories/pipelines": {
    "properties": {
        "activities": [
            {
                "typeProperties": {
                    "mlPipelineEndpointId": "=",
                    "url": {
                        "value": "="
                    },
                    "ifFalseActivities": [
                        {
                            "typeProperties": {
                                "mlPipelineEndpointId": "="
                            }
                        }
                    ],
                    "ifTrueActivities": [
                        {
                            "typeProperties": {
                                "mlPipelineEndpointId": "="
                            }
                        }
                    ],
                    "activities": [
                        {
                            "typeProperties": {
                                "mlPipelineEndpointId": "=",
                                "ifFalseActivities": [
                                    {
                                        "typeProperties": {
                                            "mlPipelineEndpointId": "=",
                                            "url": "="
                                        }
                                    }
                                ],
                                "ifTrueActivities": [
                                    {
                                        "typeProperties": {
                                            "mlPipelineEndpointId": "=",
                                            "url": "="
                                        }
                                    }
                                ]
                            }
                        }
                    ]
                }
            }
        ]
    }
}
After you export the ARM template, the JSON parameter file has records for your ML endpoints:
"ADFPIPELINE_NAME_properties_1_typeProperties_1_typeProperties_0_typeProperties_mlPipelineEndpointId": {
    "value": "445xxxxx-xxxx-xxxxx-xxxxx"
}
It is a lot of manual effort to maintain if the design changes frequently, but so far it has worked for me. Hope this answers your question.
Upvotes: 0