Reputation: 784
I have created a Dataflow pipeline in Java using Eclipse, and the JAR file of my pipeline application is stored in Google Cloud Storage.
My requirement is to automate the whole process. As I understand it, this can be done by creating a cron job or by creating a template. Can anyone explain how it can be done?
EDIT: I am getting an error at StarterPipeline.run();
ArtifactServlet.java
package my.proj;

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServletResponse;

@WebServlet(name = "ArtifactServlet", value = "/home/support/Ad-eff")
public class ArtifactServlet extends HttpServlet {
    @Override
    public void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        StarterPipeline.run();
    }
}
Upvotes: 0
Views: 550
Reputation: 1672
This article is a good source on how to schedule Dataflow pipelines, either with the App Engine Cron Service or with Cloud Functions. It is a bit outdated, since Cloud Functions was in alpha when it was published (it is now in beta), but the approach should still work.
App Engine cron job
An App Engine cron job invokes a URL defined as part of your App Engine app via HTTP GET. Because of Dataflow's pipeline execution requirements, you will need to do this in the App Engine flexible environment. Here are the steps you need to take:
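As a rough illustration of the basic idea, a URL that launches the pipeline when it receives an HTTP GET, here is a minimal sketch. It is not the linked article's code and makes several assumptions: it uses the JDK's built-in HttpServer instead of a servlet in the flexible environment so that it runs without extra dependencies, and the class name CronTriggerSketch, the path /launch, port 8080 and the Runnable launcher are all hypothetical placeholders. In a real app the launcher would call something like StarterPipeline.run().

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class CronTriggerSketch {

    /** Runs the launcher and maps the outcome to an HTTP status code. */
    public static int launch(Runnable pipelineLauncher) {
        try {
            pipelineLauncher.run();
            return 200; // App Engine cron treats a 2xx response as success
        } catch (RuntimeException e) {
            return 500; // a non-2xx response marks the cron run as failed
        }
    }

    /** Starts an HTTP server whose GET handler at the given path launches the pipeline. */
    public static HttpServer start(int port, String path, Runnable pipelineLauncher)
            throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext(path, exchange -> {
            // Cron calls the URL with GET; reject any other method with 405.
            int status = "GET".equals(exchange.getRequestMethod())
                    ? launch(pipelineLauncher)
                    : 405;
            byte[] body = ("status " + status).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(status, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder launcher (assumption): replace with the real pipeline start call.
        start(8080, "/launch", () -> System.out.println("pipeline launch requested"));
    }
}
```

Returning an error status when the launch throws means a failed launch shows up as a failed cron run instead of passing silently.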
Cloud Functions
With Cloud Functions you write Node.js functions that respond to a number of different events/triggers, such as Pub/Sub messages, Cloud Storage changes, and HTTP invocations. So you can write a Cloud Function that launches a Dataflow pipeline and use any of these triggers to kick it off.
For example, a function triggered by messages published to a Pub/Sub topic can be deployed with:
gcloud beta functions deploy myFunction --trigger-resource my-topic --trigger-event google.pubsub.topic.publish
You can then create a Servlet that publishes an empty message to my-topic. From that point on, follow steps 2 and 3 from the App Engine cron job solution described above.
Upvotes: 3