I am trying to make an Apache Storm spout read from a file line by line. I tried the code below, but it didn't work: it gave me only the first line, repeated on every call:
class SimSpout(storm.Spout):
    # Not much to do here for such a basic spout
    def initialize(self, conf, context):
        ## Open the file with read only permit
        self.f = open('data.txt', 'r')
        ## Read the first line
        self._conf = conf
        self._context = context
        storm.logInfo("Spout instance starting...")

    # Process the next tuple
    def nextTuple(self):
        # check if it reach at the EOF to close it
        for line in self.f.readlines():
            # Emit a random sentence
            storm.logInfo("Emiting %s" % line)
            storm.emit([line])

# Start the spout when it's invoked
SimSpout().run()
Upvotes: 0
Views: 1078
Reputation: 49812
Disclaimer: I have no way to test this, so this answer is based on inspection only.
You failed to save the filehandle you opened in initialize(). This edit saves the filehandle on self and then uses the saved filehandle for the read. It also fixes (I hope) some indenting that looked wrong.
import storm

class SimSpout(storm.Spout):
    # Not much to do here for such a basic spout
    def initialize(self, conf, context):
        # Open the file read-only and keep the handle on self for nextTuple()
        self.f = open('mydata.txt', 'r')
        self._conf = conf
        self._context = context
        storm.logInfo("Spout instance starting...")

    # Process the next tuple
    def nextTuple(self):
        # Read whatever is left in the saved file handle and emit each line
        for line in self.f.readlines():
            storm.logInfo("Emitting %s" % line)
            storm.emit([line])

# Start the spout when it's invoked
SimSpout().run()
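One more thing, offered as an untested sketch rather than a confirmed fix: readlines() drains the whole file on the first nextTuple() call, so every later call finds nothing left to emit. Since Storm calls nextTuple() repeatedly, it may fit better to hand out one line per call with readline(). The LineSource class, its next_line() method, and demo() below are hypothetical names made up for illustration (they are not part of Storm), and the demo reads a throwaway temp file instead of running in a topology.

```python
import os
import tempfile


class LineSource:
    """Hypothetical helper (not part of Storm): returns one line per call."""

    def __init__(self, path):
        # Keep the handle on self so each call continues where the last stopped
        self.f = open(path, "r")

    def next_line(self):
        """Return the next line without its newline, or None once exhausted."""
        if self.f is None:
            return None
        line = self.f.readline()
        if line == "":
            # readline() returns an empty string only at end of file
            self.f.close()
            self.f = None
            return None
        return line.rstrip("\n")


def demo():
    """Write a three-line temp file and pull lines one call at a time."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("first\nsecond\nthird\n")
        path = tmp.name
    try:
        source = LineSource(path)
        # Four calls: three lines, then None because the file is exhausted
        return [source.next_line() for _ in range(4)]
    finally:
        os.remove(path)
```

Inside the spout, nextTuple() would call next_line() once and only call storm.emit([line]) when the result is not None, so each invocation emits at most one tuple.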
Upvotes: 1