Reputation: 305
My team at Moloco runs a lot of Dataflow pipelines (hourly and daily, mostly batch jobs), and from time to time we want to calculate each pipeline's total cost to identify where we can save. For the past few weeks, one of our engineers has been going to the job monitoring UI (via https://console.cloud.google.com/dataflow?project=$project-name) and manually calculating the cost by looking up the number of workers, worker machine type, total PD and memory used, etc.
Recently, we noticed that the page now shows "resource metrics", which (along with the new pricing model announced a while ago) should save us time when calculating costs.
However, because we run about 60-80 Dataflow jobs every day, calculating the cost per job is still time consuming. Is there a way to obtain the total vCPU, memory, and PD/SSD usage metrics via an API given a job id, perhaps via PipelineResult or from the log of the master node? If it is not supported now, do you plan to support it in the near future? We are wondering whether we should write our own script that extracts the metrics per job id and calculates the costs, but we'd prefer not to have to do that.
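For concreteness, the manual calculation we do today is basically the arithmetic below; the worker counts, hours, and unit rates are made-up placeholders, not real Dataflow prices:

# Made-up example: 10 workers, 4 vCPUs / 15 GB RAM / 250 GB PD each, running 0.5 h,
# priced with placeholder rates (replace with the published Dataflow rates).
awk -v cpu_rate=0.01 -v mem_rate=0.001 -v pd_rate=0.0001 'BEGIN {
  cpu_hours    = 10 * 4 * 0.5      # workers * vCPUs per worker * hours
  mem_gb_hours = 10 * 15 * 0.5     # workers * memory GB per worker * hours
  pd_gb_hours  = 10 * 250 * 0.5    # workers * PD GB per worker * hours
  printf "estimated cost: %.2f\n", cpu_hours*cpu_rate + mem_gb_hours*mem_rate + pd_gb_hours*pd_rate
}'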
Thanks!
Upvotes: 4
Views: 944
Reputation: 1725
I'm one of the engineers on the Dataflow team.
I'd recommend using the command line tool to list these metrics and writing a script that parses them from the output and calculates your cost from those values (there is a rough sketch of such a script at the end of this answer). If you want to do this for many jobs, you may also want to list your jobs using gcloud beta dataflow jobs list. We are working on making this easier to obtain in the future.
Make sure you are using gcloud 135.0.0+:
gcloud version
If not, you can update it using:
gcloud components update
Log in with an account that has access to the project running your job:
gcloud auth login
Set your project:
gcloud config set project <my_project_name>
Run this command to list the metrics and grep the resource metrics:
gcloud beta dataflow metrics list <job_id> --project=<my_project_name> | grep Service -B 1 -A 3
Your results should be structured like so:
name:
  name: Service-mem_mb_seconds
  origin: dataflow/v1b3
scalar: 192001
updateTime: '2016-11-07T21:23:46.452Z'
--
name:
  name: Service-pd_ssd_gb_seconds
  origin: dataflow/v1b3
scalar: 0
updateTime: '2016-11-07T21:23:46.452Z'
--
name:
  name: Service-cpu_num
  origin: dataflow/v1b3
scalar: 0
updateTime: '2016-11-07T21:23:46.452Z'
--
name:
  name: Service-pd_gb
  origin: dataflow/v1b3
scalar: 0
updateTime: '2016-11-07T21:23:46.452Z'
--
name:
  name: Service-pd_gb_seconds
  origin: dataflow/v1b3
scalar: 12500
updateTime: '2016-11-07T21:23:46.452Z'
--
name:
  name: Service-cpu_num_seconds
  origin: dataflow/v1b3
scalar: 50
updateTime: '2016-11-07T21:23:46.452Z'
--
name:
  name: Service-pd_ssd_gb
  origin: dataflow/v1b3
scalar: 0
updateTime: '2016-11-07T21:23:46.452Z'
--
name:
  name: Service-mem_mb
  origin: dataflow/v1b3
scalar: 0
updateTime: '2016-11-07T21:23:46.452Z'
The relevant ones for your cost calculation are the cumulative ones:
Service-cpu_num_seconds
Service-mem_mb_seconds
Service-pd_gb_seconds
Service-pd_ssd_gb_seconds
Note: these metric names will change in the near future.
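Putting it together, here is a rough sketch of the kind of script mentioned above. It parses the output format shown above, assumes the job id is the first column of the jobs list output, and uses placeholder *_RATE values that you would fill in from the Dataflow pricing page:

#!/usr/bin/env bash
# Sketch only: sum the Service-* resource metrics for every job in a project
# and turn them into a rough cost estimate. The *_RATE values below are
# placeholders, not real prices.

PROJECT=my_project_name   # replace with your project
CPU_RATE=0                # per vCPU-hour
MEM_RATE=0                # per GB-hour of memory
PD_RATE=0                 # per GB-hour of standard PD
SSD_RATE=0                # per GB-hour of SSD PD

# Pull the scalar value that follows a given metric name in the metrics output.
metric() {
  echo "$1" | grep -A 2 "name: $2\$" | awk '/scalar:/ {print $2}'
}

# Skip the header row of the jobs list output and take the first column (the job id).
for job_id in $(gcloud beta dataflow jobs list --project="$PROJECT" | awk 'NR > 1 {print $1}'); do
  metrics=$(gcloud beta dataflow metrics list "$job_id" --project="$PROJECT")

  cpu_s=$(metric "$metrics" Service-cpu_num_seconds)
  mem_s=$(metric "$metrics" Service-mem_mb_seconds)
  pd_s=$(metric "$metrics" Service-pd_gb_seconds)
  ssd_s=$(metric "$metrics" Service-pd_ssd_gb_seconds)

  # Convert *_seconds to hours (and MB to GB for memory) before applying rates.
  awk -v id="$job_id" -v c="${cpu_s:-0}" -v m="${mem_s:-0}" -v p="${pd_s:-0}" -v s="${ssd_s:-0}" \
      -v cr="$CPU_RATE" -v mr="$MEM_RATE" -v pr="$PD_RATE" -v sr="$SSD_RATE" \
      'BEGIN { printf "%s %.4f\n", id, (c/3600)*cr + (m/3600/1024)*mr + (p/3600)*pr + (s/3600)*sr }'
done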
Upvotes: 9