Ivan

Reputation: 64217

How to preprocess a text stream on the fly in Python?

I need a Python 3 function (or something similar) that takes a text stream (such as sys.stdin, or the object returned by open(file_name, "rt")) and returns a text stream for some other function to consume. The returned stream should remove all spaces, replace all tabs with commas, and convert all letters to lowercase, and it should do so lazily, on the fly, as the consumer code reads the data.

I assume there is a reasonably easy way to do this in Python 3, perhaps something similar to list comprehensions, but so far I don't know what exactly it might be.

Upvotes: 6

Views: 549

Answers (2)

pstatix

Reputation: 3848

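I believe what you are looking for is the io module, more specifically io.StringIO.

You can use the open() function to read the initial data, modify it, and then pass the resulting stream around. Note that f.read() loads the whole file into memory at once, so this is not lazy: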

import io

with open(file_name, 'rt') as f:
    # Read everything, then drop spaces, turn tabs into commas, and lowercase.
    stream = io.StringIO(f.read().replace(' ', '').replace('\t', ',').lower())
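
The snippet above reads the entire file with f.read() before transforming it. If the consumer only needs to iterate over lines, a generator gives a lazy variant. This is a minimal sketch rather than a definitive solution; the name preprocess_lines is hypothetical, and the io.StringIO source only stands in for sys.stdin or an opened file.

```python
import io

def preprocess_lines(stream):
    """Yield transformed lines one at a time, only as the consumer asks for them."""
    # One translation table handles both per-character changes:
    # spaces are deleted (mapped to None) and tabs become commas.
    table = str.maketrans({' ': None, '\t': ','})
    for line in stream:
        yield line.translate(table).lower()

# In-memory stream used here in place of sys.stdin or open(file_name, 'rt').
source = io.StringIO("Hello World\tFOO\nA b\tC\n")
result = list(preprocess_lines(source))
```

Nothing is read from the source until the consumer starts iterating, so this also works on sys.stdin. The result is an iterator of strings, not a file object, so it will not help a consumer that calls read().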

Upvotes: 0

Gal Bashan

Reputation: 112

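I am not sure this is what you mean, but the easiest way I can think of is to inherit from the type returned by open (in Python 3 that is io.TextIOWrapper for text mode; the old file type no longer exists) and override the read method to do all the things you want after reading the data. A simple implementation would be:

import io

class MyFile(io.TextIOWrapper):
    # Wraps a binary stream, e.g. MyFile(open(file_name, 'rb')).
    # Only read() is transformed here; readline() and iteration are not.
    def read(self, *args, **kwargs):
        data = super().read(*args, **kwargs)
        # Remove spaces, turn tabs into commas, and lowercase the text.
        return data.replace(' ', '').replace('\t', ',').lower()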
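
Overriding read() alone leaves readline() and line iteration untouched. A wrapper built on io.TextIOBase can cover those as well. The following is only a sketch under some assumptions: the class name PreprocessedStream and its helper _pull are hypothetical, the wrapper is read-only, and read(size) treats size as a limit on the underlying stream, so the returned text can differ in length from size.

```python
import io

class PreprocessedStream(io.TextIOBase):
    """Hypothetical read-only wrapper that transforms text as it is read."""

    # Spaces are deleted (mapped to None) and tabs become commas.
    _TABLE = str.maketrans({' ': None, '\t': ','})

    def __init__(self, raw):
        self._raw = raw  # any text stream, e.g. sys.stdin or open(name, 'rt')

    def readable(self):
        return True

    def _pull(self, reader, size):
        # An empty string signals end of file, so keep reading while a chunk
        # is non-empty but vanishes after conversion (e.g. only spaces).
        while True:
            chunk = reader(size)
            if not chunk:
                return ""
            converted = chunk.translate(self._TABLE).lower()
            if converted:
                return converted

    def read(self, size=-1):
        return self._pull(self._raw.read, size)

    def readline(self, size=-1):
        # Iteration (for line in stream) goes through readline() as well.
        return self._pull(self._raw.readline, size)

# In-memory stream used here in place of sys.stdin or an opened file.
stream = PreprocessedStream(io.StringIO("Hello World\tFOO\nA b\tC\n"))
first_line = stream.readline()
rest = stream.read()
```

Data is pulled from the underlying stream only when the consumer calls read(), readline(), or iterates, so the processing stays lazy.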

Upvotes: 1
