Reputation: 1868
I am writing Apache Beam code where I have to read a JSON file that is placed in the project folder, then read its data and stream it.
This is the sample code to read JSON. Is this the correct way of doing it?
PipelineOptions options = PipelineOptionsFactory.create();
options.setRunner(SparkRunner.class);
Pipeline p = Pipeline.create(options);
PCollection<String> lines = p.apply("ReadMyFile", TextIO.read().from("/Users/xyz/eclipse-workspace/beam-prototype/test.json"));
System.out.println("lines: " + lines);
or I should use,
p.apply(FileIO.match().filepattern("/Users/xyz/eclipse-workspace/beam-prototype/test.json"))
I just need to read the JSON file below, read the complete testdata from it, and then stream it.
{
  "testdata": {
    "siteOwner": "xxx",
    "siteInfo": {
      "siteID": "id_member",
      "siteplatform": "web",
      "siteType": "soap",
      "siteURL": "www"
    }
  }
}
The above code is not reading the JSON file; it prints
lines: ReadMyFile/Read.out [PCollection]
Could you please guide me with a sample reference?
Upvotes: 1
Views: 2024
Reputation: 2621
This is the sample code to read JSON. Is this the correct way of doing it?
To quickly answer your question: yes. Your sample code is the correct way to read a file containing JSON, where each line of the file contains a single JSON element. The TextIO input transform reads a file line by line, so if a single JSON element spans multiple lines, it will not be parseable.
The second code sample can have the same effect, but note that FileIO.match() on its own only produces file metadata, not file contents; to get the lines, follow it with FileIO.readMatches() and TextIO.readFiles(), as sketched below.
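A minimal sketch of that FileIO variant, assuming Beam 2.2 or later where FileIO.readMatches() and TextIO.readFiles() are available, and reusing the pipeline p and file path from your question:

import org.apache.beam.sdk.io.FileIO;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.values.PCollection;

PCollection<String> lines = p
    // Match the file pattern; this yields MatchResult.Metadata, not file contents.
    .apply(FileIO.match().filepattern("/Users/xyz/eclipse-workspace/beam-prototype/test.json"))
    // Convert each metadata entry into a ReadableFile handle.
    .apply(FileIO.readMatches())
    // Read each matched file line by line, just like TextIO.read().
    .apply(TextIO.readFiles());

The extra steps are what make FileIO useful when you need per-file control (for example, filtering matched files) before reading them.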
The above code is not reading the JSON file; it prints
The printed result is expected. The variable lines does not actually contain the JSON strings in the file. lines is a PCollection of Strings; it simply represents the state of the pipeline after a transform is applied. Accessing elements in the pipeline can be done by applying subsequent transforms; the actual JSON string can be accessed in the implementation of a transform, as shown below.
Upvotes: 1