Reputation: 813
I am working on a basic parser that scrapes a horoscope from a website. Below is my code:
import requests
from bs4 import BeautifulSoup as bs
url = "https://www.astrospeak.com/horoscope/capricorn"
response = requests.request("GET", url)
soup = bs(response.text, 'html.parser')
locater = soup.select("#sunsignPredictionDiv > div.fullDIV > div.lineHght18 > div")
quote = locater[0].previousSibling
This leaves me with the following <class 'bs4.element.NavigableString'>:
"\n You are working towards yet another dream and as you pursue this vision there's no doubt in your mind that it will come to fruition. It's written in the stars! \n "
I am struggling to work out how to use the BeautifulSoup stripped_strings generator on the bs4.element.NavigableString. What I would like to end up with is just the string: You are working towards yet another dream and as you pursue this vision there's no doubt in your mind that it will come to fruition. It's written in the stars!
Upvotes: 0
Views: 2772
Reputation: 5434
I know the answer in the comment pretty much solves your problem, but I hope to give you some background:
import requests
from bs4 import BeautifulSoup as bs
url = "https://www.astrospeak.com/horoscope/capricorn"
response = requests.get(url)
soup = bs(response.text, 'html.parser')
locater = soup.select("#sunsignPredictionDiv > div.fullDIV > div.lineHght18 > div")
quote = locater[0].previousSibling.strip()
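So essentially I simplified your syntax by using just requests.get, which is also documented in the requests docs, and added .strip(). Called with no arguments, strip() removes leading and trailing whitespace, which includes newlines (\n) and tabs (\t), shown in their raw forms in the string. Whitespace in the middle of the string is left alone. Since NavigableString is a subclass of str, you can call strip() on it directly. strip() can also be passed a string of characters to remove from both ends instead of whitespace.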
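To make the behaviour of strip() concrete, here is a minimal sketch on a plain string shaped like the one in the question. The sample text is shortened and made up for illustration, and no request is sent to the site.

```python
# Stand-in for the NavigableString from the question (shortened, illustrative).
raw = "\n   It's written in the stars! \n\t "

# No argument: whitespace is removed from both ends only.
cleaned = raw.strip()
print(repr(cleaned))

# Whitespace inside the string is left untouched.
inner = "  a \n b  ".strip()
print(repr(inner))

# With an argument: any of the given characters are removed from both ends.
trimmed = "***note!***".strip("*")
print(repr(trimmed))
```

Printing with repr() keeps any remaining \n or \t visible, which makes it easy to check what was actually removed.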
There are also lstrip() and rstrip(), which do the same thing but only on one side: lstrip() removes leading (left) characters and rstrip() removes trailing (right) characters. For examples and if you would like to read more, you can refer here
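A short sketch of how the three methods differ, again on a made-up sample string rather than the real page content:

```python
# Illustrative sample with whitespace on both ends.
raw = "\n  It's written in the stars!  \n"

left_only = raw.lstrip()   # leading whitespace removed, trailing kept
right_only = raw.rstrip()  # trailing whitespace removed, leading kept
both = raw.strip()         # both ends removed

print(repr(left_only))
print(repr(right_only))
print(repr(both))

# Like strip(), both variants accept a set of characters to remove.
prefix_gone = "xxhixx".lstrip("x")
print(repr(prefix_gone))
```

For the quote in the question, strip() is the one you want, since the unwanted whitespace sits on both sides.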
Upvotes: 3