Tomas Jansson

Reputation: 23472

How do I do a simple HTTP request against the dataflow API on gcloud with node?

I want to monitor my dataflow jobs with an application. The application I'm developing is a nodejs application, and ideally there would be a package like @google-cloud/bigquery but for dataflow instead. I'm fully aware that I might not be able to start a job if it is not a template job, but there should be an easy way to list jobs or get job information.

Update:

I found this spec, https://dataflow.googleapis.com/$discovery/rest?version=v1b3, but I don't understand what the location field in the list operation is for. The spec was linked from this page: https://cloud.google.com/dataflow/docs/reference/rest/
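
For context, a minimal sketch of what a plain REST call to the projects.locations.jobs.list method could look like (location here being the regional endpoint the jobs run in, e.g. us-central1), assuming the google-auth-library package, Application Default Credentials, and placeholder project and region values:

// Sketch: plain REST call to list jobs in one regional endpoint.
import { GoogleAuth } from "google-auth-library";

async function listJobs(projectId: string, location: string) {
  const auth = new GoogleAuth({
    scopes: ["https://www.googleapis.com/auth/cloud-platform"],
  });
  const client = await auth.getClient();
  const url = `https://dataflow.googleapis.com/v1b3/projects/${projectId}/locations/${location}/jobs`;
  const res = await client.request({ url });
  console.log(res.data);
}

listJobs("[your-project-id]", "us-central1").catch(console.error);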

Upvotes: 0

Views: 942

Answers (2)

adrice727

Reputation: 1492

For posterity: there is a way to do this without a client library, but it requires generating a JWT from the service account credentials and exchanging it for an access token. This example launches the Cloud_Bigtable_to_GCS_Avro template:

import axios from "axios";
import jwt, { SignOptions } from "jsonwebtoken";
import mem from "mem";

const loadCredentials = mem(function() {
  // This is a string containing service account credentials
  const serviceAccountJson = process.env.GOOGLE_APPLICATION_CREDENTIALS;
  if (!serviceAccountJson) {
    throw new Error("Missing GCP Credentials");
  }

  // Escape raw newlines/tabs (common when the key is pasted into an env var)
  // so the string parses as valid JSON.
  const credentials = JSON.parse(serviceAccountJson.replace(/\n/g, "\\n").replace(/\r/g, "\\r").replace(/\t/g, "\\t"));

  return {
    projectId: credentials.project_id,
    privateKeyId: credentials.private_key_id,
    privateKey: credentials.private_key,
    clientEmail: credentials.client_email,
  };
});

interface ProjectCredentials {
  projectId: string;
  privateKeyId: string;
  privateKey: string;
  clientEmail: string;
}

function generateJWT(params: ProjectCredentials) {
  const scope = "https://www.googleapis.com/auth/cloud-platform";
  const authUrl = "https://www.googleapis.com/oauth2/v4/token";
  const issued = new Date().getTime() / 1000;
  const expires = issued + 60;

  const payload = {
    iss: params.clientEmail,
    sub: params.clientEmail,
    aud: authUrl,
    iat: issued,
    exp: expires,
    scope: scope,
  };

  // Typed explicitly so the "RS256" literal is accepted as a jwt Algorithm.
  const options: SignOptions = {
    keyid: params.privateKeyId,
    algorithm: "RS256",
  };

  return jwt.sign(payload, params.privateKey, options);
}

async function getAccessToken(credentials: ProjectCredentials): Promise<string> {
  // Short-lived signed JWT asserting the service account identity.
  const signedJwt = generateJWT(credentials);
  const authUrl = "https://www.googleapis.com/oauth2/v4/token";
  const params = {
    grant_type: "urn:ietf:params:oauth:grant-type:jwt-bearer",
    assertion: signedJwt,
  };
  try {
    const response = await axios.post(authUrl, params);
    return response.data.access_token;
  } catch (error) {
    console.error("Failed to get access token", error);
    throw error;
  }
}

function buildTemplateParams(projectId: string, table: string) {
  return {
    jobName: `[job-name]`,
    parameters: {
      bigtableProjectId: projectId,
      bigtableInstanceId: "[table-instance]",
      bigtableTableId: table,
      outputDirectory: `[gs://your-instance]`,
      filenamePrefix: `${table}-`,
    },
    environment: {
      zone: "us-west1-a", // omit or define your own
      tempLocation: `[gs://your-instance/temp]`,
    },
  };
}

async function backupTable(table: string) {
  console.info(`Executing backup template for table=${table}`);
  const credentials = loadCredentials();
  const { projectId } = credentials;
  const accessToken = await getAccessToken(credentials);
  const baseUrl = "https://dataflow.googleapis.com/v1b3/projects";
  const templatePath = "gs://dataflow-templates/latest/Cloud_Bigtable_to_GCS_Avro";
  const url = `${baseUrl}/${projectId}/templates:launch?gcsPath=${templatePath}`;
  const template = buildTemplateParams(projectId, table);
  try {
    const response = await axios.post(url, template, {
      headers: { Authorization: `Bearer ${accessToken}` },
    });
    console.log("GCP Response", response.data);
  } catch (error) {
    console.error(`Failed to execute template for ${table}`, error.message);
  }
}

async function run() {
  await backupTable("my-table");
}

// run() is async; a synchronous try/catch will not catch a rejected promise,
// so handle the rejection directly.
run().catch(() => process.exit(1));
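
For the listing part of the question, the same access token works against the projects.jobs.list endpoint. A minimal sketch that reuses the loadCredentials and getAccessToken helpers above:

// Sketch: list the Dataflow jobs in the project with the same bearer token.
async function listJobs() {
  const credentials = loadCredentials();
  const accessToken = await getAccessToken(credentials);
  const url = `https://dataflow.googleapis.com/v1b3/projects/${credentials.projectId}/jobs`;
  const response = await axios.get(url, {
    headers: { Authorization: `Bearer ${accessToken}` },
  });
  // response.data.jobs contains id, name, currentState, etc. for each job.
  console.log(response.data.jobs);
}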

Upvotes: 1

Tomas Jansson

Reputation: 23472

I did find the solution myself. There is a repo that basically covers all the gcloud APIs: https://github.com/google/google-api-nodejs-client

After I found that I could easily do what I wanted:

'use strict';

var google = require('googleapis');
var dataflow = google.dataflow('v1b3');

google.auth.getApplicationDefault(function (err, authClient, projectId) {
    if (err) {
        throw err;
    }

    // The createScopedRequired method returns true when running on GAE or a local developer
    // machine. In that case, the desired scopes must be passed in manually. When the code is
    // running in GCE or a Managed VM, the scopes are pulled from the GCE metadata server.
    // See https://cloud.google.com/compute/docs/authentication for more information.
    if (authClient.createScopedRequired && authClient.createScopedRequired()) {
        // Scopes can be specified either as an array or as a single, space-delimited string.
        authClient = authClient.createScoped([
            'https://www.googleapis.com/auth/compute'
        ]);
    }

    // List the Dataflow jobs in the project using the project id and
    // auth client returned by getApplicationDefault.
    dataflow.projects.jobs.list({
        'projectId': projectId,
        'auth': authClient
    }, function (err, result) {
        console.log(err, result);
    });
});
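
With newer releases of the googleapis package the same call can be written in a promise style. A minimal sketch, assuming Application Default Credentials are available in the environment:

// Sketch: list Dataflow jobs with a current googleapis release.
import { google } from "googleapis";

async function listDataflowJobs() {
  // GoogleAuth picks up Application Default Credentials.
  const auth = new google.auth.GoogleAuth({
    scopes: ["https://www.googleapis.com/auth/cloud-platform"],
  });
  const projectId = await auth.getProjectId();
  const dataflow = google.dataflow({ version: "v1b3", auth });
  const res = await dataflow.projects.jobs.list({ projectId });
  console.log(res.data.jobs);
}

listDataflowJobs().catch(console.error);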

Upvotes: 2
