Reputation: 122042
Given a string as such:
LexicalReordering0= -1.88359 0 -1.6864 -2.34184 -3.29584 0 Distortion0= -4 LM0= -85.3898 WordPenalty0= -13 PhrasePenalty0= 11 TranslationModel0= -6.79761 -3.06898 -8.90342 -4.35544
Each key of the desired dictionary is a token that ends with =
and the space-separated values that follow it, up to the next key, are the values of that key.
Note that the names of the keys are not known before parsing the input string.
The resulting dictionary should look like this:
{'PhrasePenalty0=': [11.0], 'Distortion0=': [-4.0], 'TranslationModel0=': [-6.79761, -3.06898, -8.90342, -4.35544], 'LM0=': [-85.3898], 'WordPenalty0=': [-13.0], 'LexicalReordering0=': [-1.88359, 0.0, -1.6864, -2.34184, -3.29584, 0.0]}
I could do so with this loop:
>>> textin ="LexicalReordering0= -1.88359 0 -1.6864 -2.34184 -3.29584 0 Distortion0= -4 LM0= -85.3898 WordPenalty0= -13 PhrasePenalty0= 11 TranslationModel0= -6.79761 -3.06898 -8.90342 -4.35544"
>>> thiskey = ""
>>> thismap = {}
>>> for element in textin.split():
...     if element[-1] == '=':
...         thiskey = element
...         thismap[thiskey] = []
...     else:
...         thismap[thiskey].append(float(element))
...
>>> thismap
{'PhrasePenalty0=': [11.0], 'Distortion0=': [-4.0], 'TranslationModel0=': [-6.79761, -3.06898, -8.90342, -4.35544], 'LM0=': [-85.3898], 'WordPenalty0=': [-13.0], 'LexicalReordering0=': [-1.88359, 0.0, -1.6864, -2.34184, -3.29584, 0.0]}
Is there another way to build the same dictionary from the input string, perhaps with a regex or a Pythonic parser library?
Upvotes: 2
Views: 4371
Reputation: 241701
Here's a way to do it using the regular expression library. I don't know if it is more efficient, or even if it could be described as pythonic:
import re
pat = re.compile(r'''([^\s=]+)=\s*((?:[^\s=]+(?:\s|$))*)''')
# The values are lists of strings
entries = dict((k, v.split()) for k, v in pat.findall(textin))
# Alternative if you want the values to be floating point numbers
entries = dict((k, list(map(float, v.split())))
               for k, v in pat.findall(textin))
In Python 2.x, you can use map(float, v.split()) instead of list(map(float, v.split())).
Unlike the original program, this one allows inputs where there is no whitespace between the = and the first value, and the keys it produces do not include the trailing =. Also, any items in the input before the first instance of key= are silently ignored. It might be better to explicitly recognize them and throw an error.
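As a rough sketch of that last suggestion, the same pattern can be driven with finditer so that any text the pattern skips over is reported instead of dropped. The function name parse_strict and the error messages are made up for illustration, and raising ValueError is just one possible choice:

```python
import re

# Same pattern as in the answer: a key, '=', then whitespace-delimited values.
PAT = re.compile(r'([^\s=]+)=\s*((?:[^\s=]+(?:\s|$))*)')

def parse_strict(text):
    """Parse 'key= v1 v2 ...' text, rejecting anything the pattern would skip.

    Hypothetical helper: keys are returned without the trailing '='.
    """
    entries = {}
    pos = 0  # end of the previous match
    for m in PAT.finditer(text):
        # Non-whitespace text between two matches was not consumed by the pattern.
        skipped = text[pos:m.start()].strip()
        if skipped:
            raise ValueError("unexpected text before key: %r" % skipped)
        entries[m.group(1)] = [float(v) for v in m.group(2).split()]
        pos = m.end()
    # Check for leftovers after the last match as well.
    leftover = text[pos:].strip()
    if leftover:
        raise ValueError("unexpected trailing text: %r" % leftover)
    return entries
```

Called on the question's textin this should give the same values as the findall version, while an input such as "1 2 a= 3" raises ValueError because of the leading values.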
Explanation of the pattern:
([^\s=]+)                 A key (any non-whitespace except =)
=\s*                      followed by = and possible whitespace
((?:[^\s=]+(?:\s|$))*)    Any number of repetitions of a string
                          without = followed by either whitespace
                          or the end of the input
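The same explanation can also live inside the pattern itself by compiling it with re.VERBOSE, which ignores whitespace and allows comments outside character classes. This is only a restatement of the pattern above on a shortened sample string (the sample is cut down from the question's input for illustration):

```python
import re

pat = re.compile(r'''
    ([^\s=]+)          # a key: any run of non-whitespace characters other than '='
    =\s*               # the '=' itself, then optional whitespace
    (                  # all values of this key, captured as one string
      (?:[^\s=]+       #   a token containing no whitespace and no '=' ...
         (?:\s|$)      #   ... followed by whitespace or the end of the input
      )*               #   repeated any number of times
    )
''', re.VERBOSE)

sample = "Distortion0= -4 LM0= -85.3898 TranslationModel0= -6.79761 -3.06898"

# findall returns one (key, values-string) tuple per key.
pairs = pat.findall(sample)

# The values string may carry trailing whitespace; split() discards it.
entries = {k: [float(x) for x in v.split()] for k, v in pairs}
```

A token such as LM0 is not swallowed as a value of the previous key because it is followed by = rather than by whitespace, so the repeated group stops just before it.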
Upvotes: 3
Reputation: 580
Since your input string is separated by spaces and every token is either a key or a value, you can use split() and then loop through the elements, starting a new list for each key and appending each value to the current key's list.
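entries = textin.split()
answer = {}
key = ""
for x in entries:
    try:
        # A token that parses as a float is a value of the current key
        x = float(x)
        answer[key].append(x)
    except ValueError:
        # Anything else is treated as a new key
        key = x[:-1]  # ignore last char '='
        answer[key] = []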
I am assuming that the first element of your string will always be a key, so answer[key] will never be accessed while key is still an empty string.
Upvotes: 0