Reputation: 1585
I've resolved this issue, but I'm wondering what caused it in the first place. I used BeautifulSoup to identify this span from a webpage:
span = <span id="ctl00_ContentPlaceHolder1_RestInfoReskin_lblRestName">Ally's Sizzlers</span>
I then assign this variable:
restaurant.name = span.contents
However, on each loop iteration this takes up a full 1 MB, and there are about 20,000 iterations. Through trial and error I came upon this solution:
restaurant.name = str(span.contents)
Can you tell me why the former span.contents takes up so much memory?
Upvotes: 1
Views: 1215
Reputation: 990
Old stuff, but just in case other people wonder: span.contents returns a list of the tag's children, and here its only element is a NavigableString instance. That instance holds references back into the DOM tree (its parent and neighbouring elements), so as long as it is in use, the garbage collector cannot release the DOM tree. Thus, as long as restaurant.name is not released from memory, the whole DOM tree is kept in memory.
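Using str(span.contents) returns a plain string which is not linked with the DOM tree, so it does not prevent the DOM tree from being released from memory. Note that this string is the text representation of the whole list; str(span.contents[0]) gives just the name as a plain string.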
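The mechanism can be reproduced without BeautifulSoup. The sketch below is only a model: Node and LinkedText are hypothetical stand-ins for a parsed tag and for NavigableString (a str subclass carrying a parent reference), and tree_survives is a helper invented for this illustration. It uses a weak reference to check whether the tree is still alive after the parsing variables are dropped.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for a parsed tag: has children and a parent link."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.contents = []

    def append(self, child):
        child.parent = self
        self.contents.append(child)


class LinkedText(str):
    """Hypothetical stand-in for NavigableString: text that points back at its tag."""

    parent = None


def build_tree():
    # html -> span -> "Ally's Sizzlers"
    root = Node("html")
    span = Node("span")
    root.append(span)
    span.append(LinkedText("Ally's Sizzlers"))
    return root, span


def tree_survives(extract):
    """Return (is the tree still alive?, the extracted value)."""
    root, span = build_tree()
    probe = weakref.ref(root)   # does not keep the tree alive by itself
    kept = extract(span)        # what restaurant.name would hold
    del root, span              # the loop moves on to the next page
    gc.collect()                # the tree contains reference cycles
    return probe() is not None, kept


# Keeping span.contents keeps the linked text, and through it the whole tree.
alive_with_contents, _ = tree_survives(lambda span: span.contents)

# Converting to a plain str breaks the link, so the tree can be collected.
alive_with_str, name = tree_survives(lambda span: str(span.contents[0]))

print(alive_with_contents, alive_with_str, name)
```

In this model the first call reports the tree as still alive and the second reports it as collected, which is the same pattern the answer describes for the real parse tree.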
Upvotes: 1