Reputation: 9
I am like super new to apache beam and i am trying to read data from a kafka topic and then print on screen.. here is what i am trying but i think i am not sure about the print part
class LogProcessor(beam.DoFn):
def process(self, element):
print(element)
(p
| 'read' >> ReadFromKafka(cluster='mykafkacluster',topic='funny')
| 'print' >> beam.ParDo(LogProcessor())
)
Can someone help me with the LogProcessor() part on how exactly that works ?
Upvotes: 0
Views: 1343
Reputation: 6572
You can use a beam.Map
with a function thats logs then returns each element in the PCollection
:
import logging
def log_element(elem):
logging.info(elem)
return elem
(p
| 'read' >> ReadFromKafka(cluster='mykafkacluster',topic='funny')
| 'print' >> beam.Map(log_element)
)
Then you will see the logs in the Dataflow
UI console or in Cloud Logging
.
Upvotes: 1