omerS

Reputation: 874

How to loop through HTML to get data in order with scrapy?

For example, I have HTML like this:

<div id="des">
    <p>One</p>
    <p>Second</p>
    <img src="firstimage" alt="">
    <p>Third</p>
    <img src="secondimage" alt="">
    <p>Fourth</p>
</div>

I can get the text of every p element with a single line:

des = response.css("#des p::text").getall()

and I can get the image sources the same way.

However, what I want is a single array of the data (the text for each p and the src for each img) in the order it appears in the HTML page, for example:

["one", "second", "firstimage", "third", "secondimage", "fourth"]

I know there are Items, which may help, but I couldn't figure out how to achieve this with them. Is there a way to loop through the div with id="des" and get the data in order?

Upvotes: 1

Views: 196

Answers (1)

PacketLoss

Reputation: 5746

You can combine two selectors in one query, separated by a comma. The results are extracted in order of occurrence in the document.

response.css("#des p::text, #des img::attr(src)").extract()
#['One', 'Second', 'firstimage', 'Third', 'secondimage', 'Fourth']
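
If you would rather loop through the div explicitly, as the question asks, the same document-order walk can be sketched as below. This is only an illustration, and it makes an assumption that is not in the answer above: it uses Python's standard-library html.parser instead of Scrapy's selectors so that it runs on its own. The names OrderedCollector and collect_in_order are hypothetical helpers made up for this sketch.

```python
from html.parser import HTMLParser

# Sample markup copied from the question.
HTML = """
<div id="des">
    <p>One</p>
    <p>Second</p>
    <img src="firstimage" alt="">
    <p>Third</p>
    <img src="secondimage" alt="">
    <p>Fourth</p>
</div>
"""


class OrderedCollector(HTMLParser):
    """Collect <p> text and <img> src values in document order.

    Hypothetical helper for illustration; not part of Scrapy.
    """

    def __init__(self):
        super().__init__()
        self.items = []      # results, in the order encountered
        self.in_des = False  # True while inside <div id="des">
        self.in_p = False    # True while inside a <p> within that div

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("id") == "des":
            self.in_des = True
        elif self.in_des and tag == "p":
            self.in_p = True
        elif self.in_des and tag == "img" and attrs.get("src"):
            self.items.append(attrs["src"])

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False
        elif tag == "div":
            # Simplification: assumes no nested <div> inside #des.
            self.in_des = False

    def handle_data(self, data):
        # Keep only non-empty text that sits inside a <p>.
        if self.in_p and data.strip():
            self.items.append(data.strip())


def collect_in_order(html):
    """Return p texts and img srcs from #des in the order they appear."""
    parser = OrderedCollector()
    parser.feed(html)
    parser.close()
    return parser.items


print(collect_in_order(HTML))
```

The parser sees tags and text in the order they appear in the page, so appending as it goes keeps that order. Inside Scrapy, the one-line combined selector above is simpler and is probably the better choice.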

Upvotes: 1
