Reputation: 941
I am trying to understand the best way to solve the following problem.
As a simple example scenario, I have a file that lists test names and whether each test passed (true/false):
test-scenario,passed
--------------------
testA,true
testB,false
Using Apache Beam, I can read and parse the file into a PCollection<TestDetails>,
and then, with subsequent transforms, write the details of all passed tests to one set of files and the details of all failed tests to another.
After writing those files, I would like to generate some counts (the total number of records processed, the number of tests that passed, and the number of tests that failed) and write them to a single file.
Should I use a global combine (Combine.globally) for this?
Upvotes: 0
Views: 731
Reputation: 1443
For this purpose you can use Beam Metrics (please see the documentation). It provides counters that fit the counts you describe, and the metric values can be queried from the pipeline result once the pipeline has finished. Please take a look at this example. Beam can also export metrics to an external sink, if that is more convenient.
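To make the counter approach concrete, here is a minimal sketch with the Java SDK. It assumes the Direct Runner is on the classpath and that the header and separator lines have already been dropped. The class name `TestCounts`, the namespace `test-results`, and the counter names are made up for illustration, and plain `String` lines stand in for `TestDetails`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.metrics.Counter;
import org.apache.beam.sdk.metrics.MetricNameFilter;
import org.apache.beam.sdk.metrics.MetricQueryResults;
import org.apache.beam.sdk.metrics.MetricResult;
import org.apache.beam.sdk.metrics.Metrics;
import org.apache.beam.sdk.metrics.MetricsFilter;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;

public class TestCounts {
  // Hypothetical namespace that groups the three counters.
  static final String NAMESPACE = "test-results";

  /** Increments the counters as a side effect and passes each line through unchanged. */
  static class CountResultsFn extends DoFn<String, String> {
    private final Counter total = Metrics.counter(NAMESPACE, "total");
    private final Counter passed = Metrics.counter(NAMESPACE, "passed");
    private final Counter failed = Metrics.counter(NAMESPACE, "failed");

    @ProcessElement
    public void processElement(@Element String line, OutputReceiver<String> out) {
      total.inc();
      if (isPassed(line)) {
        passed.inc();
      } else {
        failed.inc();
      }
      out.output(line);
    }
  }

  /** Assumes lines of the form "name,true" or "name,false"; anything else counts as failed. */
  static boolean isPassed(String line) {
    String[] fields = line.split(",");
    return fields.length == 2 && Boolean.parseBoolean(fields[1].trim());
  }

  /** Runs the pipeline, waits for it to finish, then reads the counters back. */
  static Map<String, Long> run(List<String> lines) {
    Pipeline pipeline = Pipeline.create();
    pipeline.apply(Create.of(lines)).apply(ParDo.of(new CountResultsFn()));

    PipelineResult result = pipeline.run();
    result.waitUntilFinish();

    MetricQueryResults metrics = result.metrics().queryMetrics(
        MetricsFilter.builder()
            .addNameFilter(MetricNameFilter.inNamespace(NAMESPACE))
            .build());

    Map<String, Long> counts = new TreeMap<>();
    for (MetricResult<Long> counter : metrics.getCounters()) {
      // Attempted values are used here; not every runner supports committed values.
      counts.put(counter.getName().getName(), counter.getAttempted());
    }
    return counts;
  }

  public static void main(String[] args) {
    System.out.println(run(Arrays.asList("testA,true", "testB,false")));
  }
}
```

The returned map is ordinary Java data in the driver program, so it can be written to a single file with standard file I/O once `waitUntilFinish()` returns. Note that a counter that was never incremented (for example `failed` when every test passed) may be missing from the query results, so treat a missing key as zero.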
Upvotes: 1