BloonsTowerDefence

Reputation: 1204

Replace certain characters in stream

I have a method (a .yml parser) that takes an input stream. The problem is that it throws errors when it encounters certain characters in certain places, e.g. %.

What I would like to do is take the stream, replace every % with a placeholder, and then pass it to the parser.

This is what I have (which doesn't work with the current input):

    stream = open('file.yml', 'r')
    dict = yaml.safe_load(stream)

But what I think I need is something like:

    stream = open('file.yml', 'r')
    temp_string = stringFromStream(stream)                    # convert stream to string
    temp_string = temp_string.replace('%', '_PLACEHOLDER_')   # replace with placeholder
    stream = streamFromString(temp_string)                    # convert back to stream
    dict = yaml.safe_load(stream)

Upvotes: 3

Views: 3602

Answers (2)

jwal

Reputation: 660

I found a solution for this somewhere and keep finding this answer when searching for the original source. So maybe this is useful.

Some random non-conformant YAML file with %:

    a:
      b:
      - 1
      - 2
      - 3%
      - 4

A replace-on-read drop-in for open("file.yml", "r") is:

    from yaml import safe_load_all

    class ReplacePc:
        """File-like wrapper that replaces '%' with '_PC_' as the file is read."""

        def __init__(self, filename):
            self.fn = filename
            self.buffer = ''

        def __enter__(self):
            self.fh = open(self.fn, 'r')
            return self

        def __exit__(self, _type, _value, _tb):
            self.fh.close()

        def read(self, size):
            # Fill the buffer line by line until it holds `size` characters or the file ends.
            eof = False
            while self.fn is not None and not eof and len(self.buffer) < size:
                line = self.fh.readline()
                if line == '':
                    eof = True
                self.buffer += line.replace('%', '_PC_')
            # Hand back at most `size` characters and keep the rest for the next call.
            if len(self.buffer) > size:
                chunk = self.buffer[:size]
                self.buffer = self.buffer[size:]
            else:
                chunk = self.buffer
                self.buffer = ''
            return chunk

    with ReplacePc('ex/file_with_pc.yaml') as f:
        for data in safe_load_all(f):
            print(data)

Upvotes: 0

Gareth Latty

Reputation: 89057

Edit: Apparently the original answer here no longer works, as the library now requires a file-like object.

Given that, it becomes a little more awkward. You could write your own wrapper that acts in a file-like way (the basis for this would probably be io.TextIOBase) and does the replacement in a buffer, but if you are willing to sacrifice laziness, the easiest solution is roughly what was originally suggested in the question: do the replacement in memory.

The solution for turning a string into a file-like object is io.StringIO.
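
A minimal sketch of that in-memory approach, assuming the file is small enough to read in one go. The helper name `replaced_stream` and its default arguments are hypothetical, chosen only for illustration; the commented usage assumes PyYAML is installed.

```python
import io


def replaced_stream(stream, search="%", placeholder="_PLACEHOLDER_"):
    """Read all of `stream`, swap `search` for `placeholder`, and return
    the result wrapped in a new file-like object."""
    text = stream.read()  # not lazy: the whole content is held in memory
    return io.StringIO(text.replace(search, placeholder))


# Usage, assuming PyYAML is installed and file.yml exists:
#     import yaml
#     with open("file.yml", "r") as file:
#         dictionary = yaml.safe_load(replaced_stream(file))
```

The returned io.StringIO supports read(), so it can be handed to anything that expects an open text file.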


Old answer:

A good way of doing this would be to write a generator so that it remains lazy (the whole file doesn't need to be read in at once):

    import yaml

    def replace_iter(iterable, search, replace):
        # str.replace returns a new string, so yield its result
        for value in iterable:
            yield value.replace(search, replace)

    with open("file.yml", "r") as file:
        iterable = replace_iter(file, "%", "_PLACEHOLDER")
        dictionary = yaml.safe_load(iterable)

Note the use of the with statement to open the file - this is the best way to open files in Python, as it ensures files get closed properly, even when exceptions occur.

Also note that dict is a poor variable name, as it shadows the built-in dict() and stops you from using it.

Do note that your stringFromStream() function is essentially file.read(), and streamFromString() is data.splitlines(). What you are calling a 'stream' is actually just an iterator over strings (the lines of the file).
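
As a rough illustration of that correspondence, the sketch below uses an io.StringIO in place of a real open file (an assumption made so that it runs without file.yml on disk):

```python
import io

# Stands in for open("file.yml", "r"); the content is made up for the example.
stream = io.StringIO("a: 1\nb: 2%\n")

lines = list(stream)   # iterating a file-like object yields its lines
stream.seek(0)         # rewind so the content can be read again
text = stream.read()   # the whole content as a single string

# Splitting the string on line boundaries gives back the same lines.
same = text.splitlines(keepends=True) == lines
```

Note that splitlines() drops the line endings unless keepends=True is passed.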

Upvotes: 6
