user2662833

Reputation: 1073

Preserving BeautifulSoup selection order

If I have a simple document like:

<p> hi </p>
<q> hello </q>
<p> bye </p>
<q> try </q>
<p> why </p>

and I store it in a BeautifulSoup object called doc, then calling select returns the elements grouped by tag rather than in document order:

> doc.select('p, q')
[<p> hi </p>, <p> bye </p>, <p> why </p>, <q> hello </q>, <q> try </q>]

Is it possible to get these elements in document order instead? I would like to number the tags so that "hi" gets 1, "hello" gets 2, and so on. This is a minimal example; in practice I will need to select by class, id, and tag name.

Upvotes: 4

Views: 903

Answers (2)

DYZ

Reputation: 57105

How about soup.find_all(['p', 'q']) (spelled findAll in older code)? It returns the matching tags in document order:

[<p> hi </p>, <q> hello </q>, <p> bye </p>, <q> try </q>, <p> why </p>]
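A minimal sketch of the numbering the question asks for, assuming the sample markup from the question and the standard html.parser backend. The names html, tags, and numbered are illustrative only:

```python
from bs4 import BeautifulSoup

# Sample markup from the question
html = "<p> hi </p><q> hello </q><p> bye </p><q> try </q><p> why </p>"
doc = BeautifulSoup(html, "html.parser")

# find_all walks the tree once, so matches come back in document order
tags = doc.find_all(["p", "q"])

# Number the tags starting from 1: (index, tag name, stripped text)
numbered = [
    (i, tag.name, tag.get_text(strip=True))
    for i, tag in enumerate(tags, start=1)
]

for i, name, text in numbered:
    print(i, name, text)
```

As far as I know, recent Beautiful Soup releases (4.7 and later, which hand select over to the soupsieve package) also return select results in document order, so it is worth checking doc.select('p, q') on your installed version before working around it.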

Upvotes: 0

sytech

Reputation: 41119

You can always pass your own filter function to find_all if the built-in methods don't suit your use case.

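def my_tag(tag):
    # Match every <p> or <q> tag; always return a bool
    return tag.name in ('p', 'q')

# find_all calls my_tag on each tag and keeps those where it returns True
soup.find_all(my_tag)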

The result would be:

[<p> hi </p>, <q> hello </q>, <p> bye </p>, <q> try </q>, <p> why </p>]
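Since the question mentions selecting by class, id, and tag name, the same filter-function idea can be extended to mix all three. This is only a sketch: the markup, the class name "intro", the id "main", and the function name wanted are made up for illustration and do not come from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup mixing the three kinds of criteria
html = """
<p class="intro"> hi </p>
<q> hello </q>
<div id="main"> bye </div>
<span> skipped </span>
<p> why </p>
"""
soup = BeautifulSoup(html, "html.parser")

def wanted(tag):
    # tag.get("class") is a list of class names, or None when absent
    return (
        tag.name == "q"                          # match by tag name
        or "intro" in (tag.get("class") or [])   # match by class
        or tag.get("id") == "main"               # match by id
    )

# A single find_all pass keeps the matches in document order
matches = soup.find_all(wanted)
texts = [t.get_text(strip=True) for t in matches]
print(texts)
```

Because every criterion is checked inside one function, there is only one traversal of the tree, which is what keeps the results interleaved in the order they appear in the document.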

Upvotes: 2
