Paolo Gervasoni Vila

Reputation: 307

Why does iterating over a bs4.element.ResultSet not make a copy of the original?

I am a bit confused by the behaviour of iteration over BeautifulSoup ResultSets. Generally speaking, in Python I would expect iteration to produce a copy of each element: a list cannot be modified by assigning new values to the loop variable.

l1 = [1,2,3]
for elem in l1:
    elem = elem + 10

will not modify the original list l1.

But if I do:

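from bs4 import BeautifulSoup

# Placeholder document for illustration; any HTML with a <body> will do
html_doc = '<html><body><p>Hello</p></body></html>'
soup = BeautifulSoup(html_doc, 'html.parser')

# soup('body') is shorthand for soup.find_all('body')
for elem in soup('body'):
    elem.unwrap()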

Then the original soup object is modified!

This seems inconsistent to me, but clearly I am missing something basic here.

I am using Python 3.5 and BeautifulSoup 4.4.1.

Thanks in advance

Upvotes: 1

Views: 678

Answers (1)

Paolo Gervasoni Vila

Reputation: 307

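OK, it was a beginner's mistake. Iteration never copies anything: on each pass the loop variable refers to the very object stored in the container. Assigning to the loop variable (elem = elem + 10) only rebinds that name to a new object, so the list is untouched. When the elements are mutable, as soup tags are, changing one in place through the loop variable changes the original object: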

l1 = [[1,1],[2,2]]
for elem in l1:
    elem[1] = 5555

would deliver

l1 = [[1,5555],[2,5555]]

The same behaviour applies when iterating over BeautifulSoup tags: elem.unwrap() mutates the tag itself, which is part of the soup.
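To make the difference between rebinding and in-place mutation concrete, here is a minimal sketch using only built-in lists, so it runs without bs4. The variable names (numbers, pairs, after_rebinding, same_object) are made up for illustration.

```python
# Rebinding: the loop variable is just a name bound to each element in turn.
numbers = [1, 2, 3]
for elem in numbers:
    elem = elem + 10          # binds the name to a new int; the list is untouched
after_rebinding = list(numbers)

# Mutation: the name refers to the very object stored in the list.
pairs = [[1, 1], [2, 2]]
for elem in pairs:
    elem[1] = 5555            # changes the inner list in place

# No copy is made during iteration: each yielded item is the stored object.
same_object = all(elem is pairs[i] for i, elem in enumerate(pairs))

# To replace immutable items, assign back through the index.
for i, value in enumerate(numbers):
    numbers[i] = value + 10

print(after_rebinding)  # [1, 2, 3]
print(pairs)            # [[1, 5555], [2, 5555]]
print(same_object)      # True
print(numbers)          # [11, 12, 13]
```

The identity check with "is" is the key point: the loop hands out references to the stored objects, which is why calling a mutating method such as unwrap() on a tag affects the soup it belongs to.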

Upvotes: 1
