Reputation: 1204
I have a method (a .yml parser) that takes an input stream as input. The problem is that it throws errors when it encounters certain characters in certain places, e.g. %.
What I would like to do is take the stream, replace all of the % with a placeholder, and then pass it to the parser.
This is what I have (which doesn't work with the current input):
stream = open('file.yml', 'r')
dict = yaml.safe_load(stream)
But what I think I need is something like:
stream = open('file.yml', 'r')
temp_string = stringFromStream(stream)  # convert stream to string
temp_string = temp_string.replace('%', '_PLACEHOLDER_')  # replace with placeholder
stream = streamFromString(temp_string)  # convert back to stream
dict = yaml.safe_load(stream)
Upvotes: 3
Views: 3602
Reputation: 660
I found a solution for this somewhere and keep finding this answer when searching for the original source. So maybe this is useful.
Some random non-conformant YAML file with %:
a:
b:
- 1
- 2
- 3%
- 4
A replace-on-read drop-in for open("file.yml", "r") is:
from yaml import safe_load_all


class ReplacePc():
    def __init__(self, filename):
        self.fn = filename
        self.buffer = ''

    def __enter__(self):
        self.fh = open(self.fn, 'r')
        return self

    def __exit__(self, _type, _value, _tb):
        self.fh.close()

    def read(self, size):
        # Fill the buffer line by line, replacing % as each line is read.
        eof = False
        while self.fn is not None and not eof and len(self.buffer) < size:
            line = self.fh.readline()
            if line == '':
                eof = True
            self.buffer += line.replace('%', '_PC_')
        # Return at most `size` characters and keep the rest for the next call.
        if len(self.buffer) > size:
            chunk = self.buffer[:size]
            self.buffer = self.buffer[size:]
        else:
            chunk = self.buffer
            self.buffer = ''
        return chunk


with ReplacePc('ex/file_with_pc.yaml') as f:
    for data in safe_load_all(f):
        print(data)
Upvotes: 0
Reputation: 89057
Edit: Apparently the original answer here no longer works, as the library now requires a file-like object.
Given that, it becomes a little more awkward. You could write your own wrapper that acts in a file-like way (the basis for this would probably be io.TextIOBase) and does the replacement in a buffer, but if you are willing to sacrifice laziness, the easiest solution is roughly what was originally suggested in the question: do the replacement in memory.
The way to turn a string into a file-like object is io.StringIO.
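The in-memory approach described above could look roughly like the following. This is only a sketch: the helper name open_replaced and its default placeholder are made up for illustration and are not part of any library.

```python
import io


def open_replaced(path, search="%", replacement="_PLACEHOLDER_"):
    """Hypothetical helper: read the whole file, swap `search` for
    `replacement`, and return the result as a file-like object."""
    with open(path, "r") as handle:
        text = handle.read()  # not lazy: the entire file is held in memory
    return io.StringIO(text.replace(search, replacement))
```

The returned object can then be passed to yaml.safe_load() in place of the open file, at the cost of reading everything up front.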
Old answer:
A good way of doing this would be to write a generator, so that it remains lazy (the whole file doesn't need to be read in at once):
import yaml

def replace_iter(iterable, search, replace):
    for value in iterable:
        # str.replace returns a new string, so yield its result
        yield value.replace(search, replace)

with open("file.yml", "r") as file:
    iterable = replace_iter(file, "%", "_PLACEHOLDER")
    dictionary = yaml.safe_load(iterable)
Note the use of the with statement to open the file - this is the best way to open files in Python, as it ensures files get closed properly, even when exceptions occur.
Also note that dict is a poor variable name, as it will shadow the built-in dict() and stop you from using it.
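A small illustration of the shadowing problem, kept inside a function so it does not affect anything else (the function name shadowing_demo is made up for this example):

```python
def shadowing_demo():
    dict = {"a": 1}  # this local name now hides the built-in dict()
    try:
        dict(b=2)  # tries to call the dictionary object, not the built-in
    except TypeError as error:
        return str(error)
```

Calling shadowing_demo() returns the TypeError message Python raises when the shadowed name is called.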
Do note that your stringFromStream() function is essentially file.read(), and streamFromString() is data.splitlines(). What you are calling a 'stream' is actually just an iterator over strings (the lines of the file).
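To make the relationship between these concrete, here is a short sketch that uses io.StringIO as a stand-in for an open text file (the sample content is invented):

```python
import io

handle = io.StringIO("a: 1\nb: 2%\n")  # stands in for open("file.yml", "r")

lines = list(handle)       # iterating yields one string per line, newline included
handle.seek(0)             # rewind before reading again
text = handle.read()       # the whole content as a single string
split = text.splitlines()  # back to a list of lines, without the newlines
```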
Upvotes: 6