Daniel O.

Reputation: 813

BeautifulSoup stripping whitespace

I am working on a basic horoscope parser from a website. Below is my code:

import requests
from bs4 import BeautifulSoup as bs

url = "https://www.astrospeak.com/horoscope/capricorn"

response = requests.request("GET", url)

soup = bs(response.text, 'html.parser')

locater = soup.select("#sunsignPredictionDiv > div.fullDIV > div.lineHght18 > div")

quote = locater[0].previousSibling

This leaves me with the following <class 'bs4.element.NavigableString'>:

"\n                      You are working towards yet another dream and as you pursue this vision there's no doubt in your mind that it will come to fruition. It's written in the stars! \n                      "

I am struggling to work out how to use the BeautifulSoup stripped_strings generator on the bs4.element.NavigableString. What I would like to end up with is just the string: You are working towards yet another dream and as you pursue this vision there's no doubt in your mind that it will come to fruition. It's written in the stars!

Upvotes: 0

Views: 2772

Answers (1)

BernardL

Reputation: 5434

I know the answer in the comment pretty much solves your problem, but I hope to give you some background:

import requests
from bs4 import BeautifulSoup as bs

url = "https://www.astrospeak.com/horoscope/capricorn"
response = requests.get(url)
soup = bs(response.text, 'html.parser')
locater = soup.select("#sunsignPredictionDiv > div.fullDIV > div.lineHght18 > div")

# NavigableString subclasses str, so str.strip() can be called on it directly
quote = locater[0].previousSibling.strip()

So essentially I simplified your syntax by using requests.get, which is documented in the requests docs, and added .strip(). Called with no arguments, strip() removes leading and trailing whitespace, including newlines (\n) and tabs (\t), which show up in their raw form when the string is displayed; whitespace inside the text is left untouched. strip() can also remove other leading and trailing characters if you pass them as an argument.
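To make that concrete without hitting the website, here is a minimal sketch that uses a plain Python string as a stand-in for the NavigableString (it subclasses str, so the same method applies). The sample text is shortened from the question, and the variable names are only illustrative.

```python
# Stand-in for the NavigableString from the question (text shortened).
raw = "\n                      It's written in the stars! \n                      "

# strip() with no argument removes leading and trailing whitespace only.
quote = raw.strip()
print(repr(quote))

# Whitespace in the middle of the text is not touched.
inner = "  written \t in the stars  ".strip()
print(repr(inner))
```

The second call shows why strip() is safe to use on a sentence: only the padding at the two ends goes away.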

There are also lstrip() and rstrip(), which do the same thing on one side only: lstrip() removes leading (left) whitespace and rstrip() removes trailing (right) whitespace. For examples and if you would like to read more, you can refer here
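As a rough illustration of the one-sided variants and of passing characters to strip(), the sketch below again uses made-up sample strings rather than the scraped text.

```python
text = "\n   written in the stars!   \n"

# lstrip() removes leading whitespace and keeps the trailing part.
left_only = text.lstrip()
print(repr(left_only))

# rstrip() removes trailing whitespace and keeps the leading part.
right_only = text.rstrip()
print(repr(right_only))

# With an argument, strip() removes any of the given characters from both
# ends instead of whitespace. The argument is a set of characters, not a prefix.
trimmed = "***stars!***".strip("*")
print(repr(trimmed))
```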

Upvotes: 3
