user5520049

Open a file in an Apache Storm Spout with Python

I am trying to make an Apache Storm spout read from a file line by line. I wrote the code below, but it doesn't work: it emits only the first line, over and over:

class SimSpout(storm.Spout):
    # Not much to do here for such a basic spout
    def initialize(self, conf, context):
        # Open the file read-only
        self.f = open('data.txt', 'r')
        self._conf = conf
        self._context = context
        storm.logInfo("Spout instance starting...")

    # Process the next tuple
    def nextTuple(self):
        # Read from the file and emit each line
        for line in self.f.readlines():
            # Emit the line as a single-field tuple
            storm.logInfo("Emitting %s" % line)
            storm.emit([line])

# Start the spout when it's invoked
SimSpout().run()

Upvotes: 0

Views: 1078

Answers (1)

Stephen Rauch

Reputation: 49812

Disclaimer: Since I have no way to test this, this answer is based on inspection only.

The original code did not save the file handle it opened in initialize(). This edit saves the handle on self and then uses the saved handle for the read. It also fixes (I hope) some indentation that looked wrong.

import storm


class SimSpout(storm.Spout):
    # Not much to do here for such a basic spout
    def initialize(self, conf, context):
        # Open the file read-only and keep the handle on the instance
        self.f = open('mydata.txt', 'r')
        self._conf = conf
        self._context = context

        storm.logInfo("Spout instance starting...")

    # Process the next tuple
    def nextTuple(self):
        # readlines() returns every remaining line, so the first call emits
        # the whole file and later calls find nothing left to emit
        for line in self.f.readlines():
            # Emit the line as a single-field tuple
            storm.logInfo("Emitting %s" % line)
            storm.emit([line])

# Start the spout when it's invoked
SimSpout().run()
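
If you would rather emit one line per nextTuple() call instead of the whole file at once, the reading logic could look roughly like the sketch below. This is untested against Storm itself: LineReader and next_line are hypothetical names, not part of the storm module, and an in-memory io.StringIO stands in for the data file so the snippet runs on its own.

```python
import io


class LineReader:
    """Hypothetical helper (not part of Storm): hands out one line per call."""

    def __init__(self, handle):
        # Any open, readable text file object works here
        self._handle = handle

    def next_line(self):
        """Return the next line without its newline, or None at end of file."""
        line = self._handle.readline()
        if line == "":
            # readline() returns an empty string only at end of file
            return None
        return line.rstrip("\n")


# Stand-in for the data file so the sketch runs without anything on disk
reader = LineReader(io.StringIO("first\nsecond\nthird\n"))

emitted = []
while True:
    line = reader.next_line()
    if line is None:
        break
    emitted.append(line)

print(emitted)  # prints ['first', 'second', 'third']
```

In the spout, initialize() would build the reader from the opened file, and nextTuple() would call next_line() once and emit [line] only when the result is not None.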

Upvotes: 1
