Ricko

Reputation: 9

Read data from Kafka and print it in Apache Beam

I am very new to Apache Beam. I am trying to read data from a Kafka topic and print it to the screen. Here is what I have so far, but I am not sure about the print part:

class LogProcessor(beam.DoFn):
    def process(self, element):
        print(element)


(p
 | 'read' >> ReadFromKafka(cluster='mykafkacluster',topic='funny')
 | 'print' >> beam.ParDo(LogProcessor())
 )

Can someone explain how exactly the LogProcessor() part works?

Upvotes: 0

Views: 1343

Answers (1)

Mazlum Tosun

Reputation: 6572

You can use beam.Map with a function that logs and then returns each element in the PCollection:

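import logging

import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka

def log_element(elem):
    # Log the element, then return it so downstream steps still receive it.
    logging.info(elem)
    return elem

# ReadFromKafka takes a consumer_config dict and a list of topics;
# the broker address below is a placeholder host:port.
(p
 | 'read' >> ReadFromKafka(
     consumer_config={'bootstrap.servers': 'mykafkacluster:9092'},
     topics=['funny'])
 | 'print' >> beam.Map(log_element)
 )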

Then you will see the logs in the Dataflow UI console or in Cloud Logging.

Upvotes: 1
