SUDHIR GARG
SUDHIR GARG

Reputation: 75

fetching Cloud Data Fusion Runtime info

I want to pass the runid of Data fusion pipeline to some function upon pipeline completion but i am not able to find any run-time variable which holds this value. Please help!

Upvotes: 2

Views: 1045

Answers (2)

FVCC
FVCC

Reputation: 317

As an update to the previous answer, the first thing to do is to obtain the details of the deployed pipelines in a given namespace. For this, the following endpoint should be queried: '/v3/namespaces/${NAMESPACE}/apps'. Where ${NAMESPACE} is the namespace where the pipeline is deployed.

This endpoint returns a list with the pipelines deployed on this namespace ${NAMESPACE} (not the pipeline JSON, just a high level description list). Once the pipeline list is obtained, to obtain the run metrics of a given pipeline, the following endpoint should be called: '/v3/namespaces/${NAMESPACE}/apps/${PIPELINE}/workflows/DataPipelineWorkflow/runs', where ${PIPELINE} is the name of the pipeline. This endpoint will return the details of all the runs for this pipeline. This is where the run_id can be obtained. The field containing the run_id is actually called runid in this list.

With the run_id, you can then obtain all the run logs for example by querying the endpoint '{CDAP_ENDPOINT}/v3/namespaces/{NAMESPACE}/apps/{PIPELINE}/workflows/DataPipelineWorkflow/runs/{run["runid"]}/logs?start={run["start"]}&stop={run["start"]}'. The previous snippet is a python snippet where run is a dictionary containing the run details of a particular run.

As explained in the CDAP microservice guide, to call these endpoints, the CDAP endpoint must be obtained by running the command: gcloud beta data-fusion instances describe --project=${PROJECT} --location=${REGION} --format="value(apiEndpoint)" ${INSTANCE_ID}. The authentication token will also be needed and this can be found through running: gcloud auth print-access-token.

Upvotes: 1

aga
aga

Reputation: 3883

The correct answer has been provided by @Edwin Elia in the comment section:

Retrieving the run-id of a Data Fusion pipeline within its run or the predecessor pipeline's is not possible currently. Here is an enhancement that you can track that would make it possible.

When talking about retrieving the run_id value after pipeline completion you should be able to use the REST API from the CDAP documentation to get information on the run including the run-id.

Upvotes: 0

Related Questions