Loop iterating through all rows instead of iterating through each row separately

Question

So here is the table that I am trying to get data from



ETC...
This is the scrapy code that I have so far for parsing
    for sel in response.xpath('//tr'):
        string = " ".join(response.xpath('//th/a/text()').extract()) + ":" + " ".join(response.xpath('//td/text()').extract())
        print string
But this yields a result like this:
Level Components Casting Time Range Effect Duration Saving Throw Spell Resistance:V, S, M, XP 12 hours 0 ft. One duplicate creature Instantaneous None No
When the output should look something like 
Level: CLR 1  Components:V, S, M etc...
Essentially, it isn't looping through each row of the table, finding the one <th> and one <td> cell in that row, and sticking them together. Instead it finds all of the data from every <th> and all of the data from every <td>, then sticks those two sets together. I assume my for statement needs to be fixed. How do I get it to examine each row individually?

    Level:          Clr 3
    Components:     V, S
    Casting Time:   1 standard action

Anand S Kumar · Accepted Answer

When you query an xpath like -

response.xpath('//th/a/text()')

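This returns the text of every <a> element inside a <th> element anywhere in the document, because a path that starts with // is evaluated from the document root, not from the row you are iterating over. That is not what you want. You should do -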

for sel in response.xpath('//tr'):
    string = " ".join(sel.xpath('.//th/a/text()').extract()) + ":" + " ".join(sel.xpath('.//td/text()').extract())
    print string

The leading dot in each xpath inside the loop makes the query run relative to the current node (sel, the current row) rather than from the document root.

More details on relative xpaths at Working with Relative XPaths
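
To see the difference without running a spider, here is a minimal sketch that uses Python's standard-library ElementTree as a stand-in for Scrapy's selectors. The markup is an assumption, reconstructed from the XPaths in the question (header text inside th/a, value text inside td), and the texts helper is a hypothetical name introduced only for this illustration. It contrasts querying the whole document on each pass with querying relative to the current row.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, reconstructed from the XPaths in the question
# (header text inside <th><a>, value text inside <td>).
HTML = """
<table>
  <tr><th><a>Level:</a></th><td>Clr 3</td></tr>
  <tr><th><a>Components:</a></th><td>V, S</td></tr>
  <tr><th><a>Casting Time:</a></th><td>1 standard action</td></tr>
</table>
"""

root = ET.fromstring(HTML)


def texts(node, path):
    """Return the stripped text of every element matching path under node."""
    return [(el.text or "").strip() for el in node.findall(path)]


# Querying the whole document reproduces the question's symptom:
# all headers are joined together, followed by all values.
merged = " ".join(texts(root, ".//th/a")) + " " + " ".join(texts(root, ".//td"))

# Querying relative to the current row yields one header/value pair per row.
rows = [
    " ".join(texts(row, ".//th/a")) + " " + " ".join(texts(row, ".//td"))
    for row in root.iter("tr")
]

print(merged)
for line in rows:
    print(line)
```

The same idea carries over to Scrapy: call xpath on the row selector (sel) with a path that begins with a dot, instead of calling it on response.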
