foxwendy

Reputation: 2929

Apache Beam exception when running wordcount example

I think I followed every step in the documentation, but I still ran into this exception. (The only difference is that I run this from Eclipse J2EE, but I wouldn't expect that to matter, would it?)

Code (I didn't write this; it comes straight from the Beam project examples). I would have expected to specify a Google Cloud Platform project and provide the right credentials to access it, but I can't find anywhere in this example project where that is set up.

  public static void main(String[] args) {
    // Create a PipelineOptions object. This object lets us set various execution
    // options for our pipeline, such as the runner you wish to use. This example
    // will run with the DirectRunner by default, based on the class path configured
    // in its dependencies.
    PipelineOptions options = PipelineOptionsFactory.create();

    // Create the Pipeline object with the options we defined above.
    Pipeline p = Pipeline.create(options);

    // Apply the pipeline's transforms.

    // Concept #1: Apply a root transform to the pipeline; in this case, TextIO.Read to read a set
    // of input text files. TextIO.Read returns a PCollection where each element is one line from
    // the input text (a set of Shakespeare's texts).

    // This example reads a public data set consisting of the complete works of Shakespeare.
    p.apply(TextIO.Read.from("gs://apache-beam-samples/shakespeare/*"))
    .....
    )

Exception:

Exception in thread "main" java.lang.IllegalStateException: Failed to validate gs://apache-beam-samples/shakespeare/*
at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:309)
at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:205)
at org.apache.beam.sdk.runners.PipelineRunner.apply(PipelineRunner.java:76)
at org.apache.beam.runners.direct.DirectRunner.apply(DirectRunner.java:296)
at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:388)
at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:302)
at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:47)
at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:152)
at google.dataflow.beam.example.MinimalWordCount.main(MinimalWordCount.java:77)
Caused by: java.io.IOException: Unable to match files in bucket apache-beam-samples, prefix shakespeare/ against pattern shakespeare/[^/]*
at org.apache.beam.sdk.util.GcsUtil.expand(GcsUtil.java:234)
at org.apache.beam.sdk.util.GcsIOChannelFactory.match(GcsIOChannelFactory.java:53)
at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:304)
... 8 more
Caused by: com.google.api.client.http.HttpResponseException: 400 Bad Request
{
  "error" : "invalid_grant"
}
    at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1070)
    at com.google.auth.oauth2.UserCredentials.refreshAccessToken(UserCredentials.java:207)
    at com.google.auth.oauth2.OAuth2Credentials.refresh(OAuth2Credentials.java:149)
    at com.google.auth.oauth2.OAuth2Credentials.getRequestMetadata(OAuth2Credentials.java:135)
    at com.google.auth.http.HttpCredentialsAdapter.initialize(HttpCredentialsAdapter.java:96)
    at com.google.cloud.hadoop.util.ChainingHttpRequestInitializer.initialize(ChainingHttpRequestInitializer.java:52)
    at com.google.api.client.http.HttpRequestFactory.buildRequest(HttpRequestFactory.java:93)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.buildHttpRequest(AbstractGoogleClientRequest.java:300)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
    at com.google.cloud.hadoop.util.ResilientOperation$AbstractGoogleClientRequestExecutor.call(ResilientOperation.java:166)
    at com.google.cloud.hadoop.util.ResilientOperation.retry(ResilientOperation.java:66)
    at com.google.cloud.hadoop.util.ResilientOperation.retry(ResilientOperation.java:103)
    at org.apache.beam.sdk.util.GcsUtil.expand(GcsUtil.java:227)
    ... 10 more

Upvotes: 2

Views: 1890

Answers (1)

Dpk Goyal

Reputation: 113

Try running it from the command prompt if you are on Windows. Go to the folder containing the pom.xml file, open cmd there, and run the command with the appropriate arguments:

mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount -Dexec.args=" --output=counts" -Pdirect-runner

If you want to run it with your own input file, create a .txt file with any name, put it in the folder containing pom.xml, and then run the following command:

mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount -Dexec.args="--inputFile=YOURFILENAME.txt --output=counts" -Pdirect-runner

Hope this helps. Meanwhile, I am looking into the rest of your issue.
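
On the exception itself: the bottom of the trace shows UserCredentials.refreshAccessToken failing with invalid_grant, which suggests the credentials were picked up from somewhere on the machine rather than from the example project, and that the stored refresh token may be expired or revoked. Below is a minimal sketch for checking which credential file might be involved. It assumes the usual Application Default Credentials lookup locations (the GOOGLE_APPLICATION_CREDENTIALS environment variable, then the gcloud file under %APPDATA% on Windows or ~/.config elsewhere); I have not confirmed these against your Beam version. The class name AdcLocations and the method candidates are made up for illustration, and only the Java standard library is used.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class AdcLocations {

    // Returns candidate credential files in the order this sketch ASSUMES they are
    // consulted. The locations are an assumption, not taken from the Beam example.
    static List<Path> candidates(Map<String, String> env, String osName, String userHome) {
        List<Path> paths = new ArrayList<>();

        // 1. An explicitly configured credentials file, if the variable is set.
        String explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS");
        if (explicit != null && !explicit.isEmpty()) {
            paths.add(Paths.get(explicit));
        }

        // 2. The file written by the gcloud tool (assumed well-known location).
        boolean windows = osName.toLowerCase().startsWith("windows");
        String appData = env.get("APPDATA");
        if (windows && appData != null) {
            paths.add(Paths.get(appData, "gcloud", "application_default_credentials.json"));
        } else {
            paths.add(Paths.get(userHome, ".config", "gcloud", "application_default_credentials.json"));
        }
        return paths;
    }

    public static void main(String[] args) {
        List<Path> paths = candidates(
                System.getenv(),
                System.getProperty("os.name"),
                System.getProperty("user.home"));
        // Print each candidate and whether a file is actually present there.
        for (Path p : paths) {
            System.out.println(p + (Files.exists(p) ? "  (exists)" : "  (not found)"));
        }
    }
}
```

If one of the printed files exists and is old, refreshing it (for example by logging in again with the gcloud tool) would be the first thing I would try. Note that Eclipse may launch the program with a different environment than your terminal, so run this from the same place you run the example.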

Upvotes: 2
