How can I select all text within an element in scrapy if said element has other elements inside?

Question

I've got a page with sections like the one below. Each section is basically a single question inside the main p tag, but whenever the question contains superscripts, my code breaks.

The text I want to get is: "For Cosine Rule of any triangle ABC, b2 is equal to"

MCQ.For Cosine Rule of any triangle ABC, b² is equal to
    
        a² - c² + 2ab cos A
        a³ + c³ - 3ab cos A
        a² + c² - 2ac cos B
        a² - c² 4bc cos A

When I select the text of the p, I lose the 2 that is supposed to be superscripted. I also get two separate strings in the list instead of one sentence, which messes things up when I try to store the answers.

    response.css('p::text').extract()  # ["For Cosine Rule of any triangle ABC, b", "is equal to"]

I tried selecting the superscripts separately using

    response.css('p sup::text')

and then merging them back in by checking whether a fragment starts with a lowercase letter, but that broke down once there were many questions. Here's what I'm doing in my parse method:

    questions = [x for x in questions if x != ' ']  # the list I get usually has a bunch of ' ' entries in it
    question_sup = response.css('p sup::text').extract()
    answer_sup = response.css('li sup::text').extract()
    all_choices = response.css('li::text')[:-2].extract()  # for choices
    all_answer = response.css('.dsplyans::text').extract()  # for answers

    if question_sup:
        count = -1
        for question in questions:
            # [1] because there is a space at the start
            if not question[1].isupper() or question[0] in [',', '.']:
                questions[count] += question_sup.pop(0) + question
                del questions[count + 1]

            count += 1

What I tried above fails quite often, and because I'm crawling a lot of pages, I have no idea how to debug it. I keep getting a "pop from empty list" error, which I guess means something is wrong with the merging logic above. Any help would be much appreciated!


Answers (1)
