Reputation: 783
I've got a function which parses a sentence by building up a chart. But Python holds on to whatever memory was allocated during that function call. That is, I do
    best = translate(sentence, grammar)
and somehow my memory goes up and stays up. Here is the function:
    from string import join
    from heapq import nsmallest, heappush
    from collections import defaultdict

    MAX_TRANSLATIONS=4 # or choose something else

    def translate(f, g):
        words = f.split()
        chart = {}
        for col in range(len(words)):
            for row in reversed(range(0,col+1)):
                # get rules for this subspan
                rules = g[join(words[row:col+1], ' ')]
                # ensure there's at least one rule on the diagonal
                if not rules and row==col:
                    rules=[(0.0, join(words[row:col+1]))]
                # pick up rules below & to the left
                for k in range(row,col):
                    if (row,k) and (k+1,col) in chart:
                        for (w1, e1) in chart[row, k]:
                            for (w2, e2) in chart[k+1,col]:
                                heappush(rules, (w1+w2, e1+' '+e2))
                # add all rules to chart
                chart[row,col] = nsmallest(MAX_TRANSLATIONS, rules)
        (w, best) = chart[0, len(words)-1][0]
        return best

    g = defaultdict(list)
    g['cela'] = [(8.28, 'this'), (11.21, 'it'), (11.57, 'that'), (15.26, 'this ,')]
    g['est'] = [(2.69, 'is'), (10.21, 'is ,'), (11.15, 'has'), (11.28, ', is')]
    g['difficile'] = [(2.01, 'difficult'), (10.08, 'hard'), (10.19, 'difficult ,'), (10.57, 'a difficult')]
    sentence = "cela est difficile"
    best = translate(sentence, g)
I'm using Python 2.7 on OS X.
Upvotes: 3
Views: 1553
Reputation: 77400
Within the function, you set `rules` to an element of `grammar` (the parameter `g`); `rules` then refers to that element, which is a list. You then add items to `rules` with `heappush`, which (as lists are mutable) means `grammar` holds on to the pushed values via that list. If you don't want this to happen, use `copy` when assigning `rules`, or `deepcopy` on the grammar at the start of `translate`. Note that even if you copy the list to `rules`, the grammar will record an empty list every time you retrieve an element for a missing key, because it is a `defaultdict(list)`.
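As a rough sketch of the copy-based fix (not necessarily the only or best way to do it), the function from the question could be adapted as below. Two choices here are assumptions of this sketch rather than part of the original code: `g.get(span, ())` is used instead of `g[span]` so that the `defaultdict` does not store an empty list for missing spans, and `list(...)` makes the shallow copy instead of the `copy` module. It also uses `' '.join(...)` and a plain `append`, so it should run on both Python 2.7 and Python 3.

```python
from collections import defaultdict
from heapq import nsmallest

MAX_TRANSLATIONS = 4


def translate(f, g):
    """Chart parser from the question, adapted so it never mutates g."""
    words = f.split()
    chart = {}
    for col in range(len(words)):
        for row in reversed(range(col + 1)):
            span = ' '.join(words[row:col + 1])
            # .get() bypasses the defaultdict factory, so no empty list is
            # stored for a missing span; list() makes a shallow copy, so
            # later appends cannot reach the lists held by g.
            rules = list(g.get(span, ()))
            # ensure there's at least one rule on the diagonal
            if not rules and row == col:
                rules = [(0.0, span)]
            # pick up rules below & to the left
            for k in range(row, col):
                if (row, k) in chart and (k + 1, col) in chart:
                    for w1, e1 in chart[row, k]:
                        for w2, e2 in chart[k + 1, col]:
                            # nsmallest does the ordering, so append suffices
                            rules.append((w1 + w2, e1 + ' ' + e2))
            chart[row, col] = nsmallest(MAX_TRANSLATIONS, rules)
    w, best = chart[0, len(words) - 1][0]
    return best


g = defaultdict(list)
g['cela'] = [(8.28, 'this'), (11.21, 'it'), (11.57, 'that'), (15.26, 'this ,')]
g['est'] = [(2.69, 'is'), (10.21, 'is ,'), (11.15, 'has'), (11.28, ', is')]
g['difficile'] = [(2.01, 'difficult'), (10.08, 'hard'),
                  (10.19, 'difficult ,'), (10.57, 'a difficult')]

# Record the grammar's shape before and after the call to check for growth.
sizes_before = {key: len(rules) for key, rules in g.items()}
best = translate("cela est difficile", g)
sizes_after = {key: len(rules) for key, rules in g.items()}
```

With this version, `sizes_before` and `sizes_after` should be equal after the call, whereas the original function adds keys such as `'cela est'` to the grammar and lengthens the existing lists on every call.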
Upvotes: 1