Cutting part of a string variable in Python (web scraping)

Question

I'm trying to scrape a website, and I managed to extract all the text I wanted using this code:

nameList = bsObj.find_all("strong")  # find_all is the current spelling of findAll
for text in nameList:
    string = text.get_text()
    if "Title" in string:
        print(string)

And I get the texts in this fashion:

Title 1: textthatineed

Title 2: textthatineed

Title 3: textthatineed

Title 4: textthatineed

Title 5: textthatineed

Title 6: textthatineed

Title 7: textthatineed ....

Is there any way to cut the string in Python, using BeautifulSoup or anything else, so that I get only "textthatineed" without the "Title (number): " prefix?

ren · Accepted Answer

Say we have

s = 'Title 1: textthatineed'

The text we want starts two characters after the colon (one for the colon itself, one for the space that follows it), so we find the colon's index, add two, and take the substring from that index to the end:

index = s.find(':') + 2
title = s[index:]

Note that find() returns the index of the first occurrence only, so any colons inside the text itself do not affect the result.
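One caveat with the approach above: find() returns -1 when the string has no colon, so s[index:] would start at index 1 and silently drop the first character. As a rough sketch of an alternative, assuming the label always ends at the first colon, str.partition can do the split and report whether a colon was found. The helper name text_after_label is made up for illustration and is not part of the original answer.

```python
def text_after_label(s, sep=":"):
    """Return the part of s after the first sep, stripped of whitespace.

    If sep does not occur in s, return s itself (stripped).
    text_after_label is a hypothetical helper name used for this sketch.
    """
    # partition splits at the first occurrence only; `found` is '' if sep is absent
    head, found, tail = s.partition(sep)
    return tail.strip() if found else s.strip()


samples = ["Title 1: textthatineed", "Title 2: a: b", "no label here"]
for s in samples:
    print(text_after_label(s))
```

In the scraping loop from the question, this would be called on string in place of printing it directly.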
