Reputation: 174
I am trying to send a GET request to a website, get the HTML, and append it to a list. The problem is that \n shows up in random places, and I need a script that gets rid of it. I have tried strip() and replace() and everything in between.
Here is my code:
r = requests.get(page)
data = r.text
html = BeautifulSoup(data, "html.parser")
for lin in html.find_all("link", href=True):
    if "css" in lin['href']:
        urls.append(lin['href'])
for url in urls:
    if "http" in url:
        sourcecode.append(data)
I just need to eliminate \n from the source code.
Upvotes: 0
Views: 111
Reputation: 174
I solved this problem by opening the file in binary mode:
f = open("file", "ab+")
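A likely reason binary mode helps, assuming the script runs on Windows: a file opened in text mode translates every "\n" to "\r\n" on write, so text that already contains "\r\n" ends up with doubled line endings. Binary mode writes the bytes untouched. Here is a minimal sketch of that idea; append_bytes and the sample payload are hypothetical names for illustration, and the payload stands in for r.content (the raw bytes of a requests response).

```python
import os
import tempfile


def append_bytes(path, payload):
    """Append raw bytes to path; binary mode performs no newline translation."""
    with open(path, "ab+") as f:
        f.write(payload)


# Hypothetical sample standing in for r.content from a requests response.
sample = b"body {\r\n  margin: 0;\r\n}\r\n"

# Write to a temporary file, then read it back in binary mode to compare.
fd, tmp_path = tempfile.mkstemp()
os.close(fd)
append_bytes(tmp_path, sample)
with open(tmp_path, "rb") as f:
    written = f.read()
os.remove(tmp_path)
```

With requests, that would mean writing r.content instead of r.text, since a file opened with "ab+" only accepts bytes.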
Upvotes: 0
Reputation: 39
I hope this resolves your problem. I checked it on a page and it worked.
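import requests
from bs4 import BeautifulSoup

urls = []        # stylesheet URLs found on the page
sourcecode = []  # collected page sources

# page is the URL being fetched, as in the question
r = requests.get(page)
data = r.text
html = BeautifulSoup(data, "html.parser")
for lin in html.find_all("link", href=True):
    if "css" in lin['href']:
        # drop any newline characters embedded in the href
        urls.append(lin['href'].replace("\n", ""))
for url in urls:
    if "http" in url:
        sourcecode.append(data)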
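If the goal is to strip line breaks from the fetched text itself rather than from the href, one thing that often trips people up with strip() and replace() is that Python strings are immutable: both methods return a new string, and the result has to be assigned. A small sketch, where remove_newlines and the sample string are hypothetical and only for illustration:

```python
def remove_newlines(text):
    """Return text with all line breaks removed.

    str.splitlines() splits on "\n", "\r\n" and "\r" (among other line
    boundaries), so joining the pieces drops every kind of line ending.
    """
    return "".join(text.splitlines())


# Hypothetical sample standing in for r.text.
raw = "body {\r\n  margin: 0;\n}\n"

# The return value must be kept; raw itself is never modified.
cleaned = remove_newlines(raw)
```

In the snippet above this would be used as sourcecode.append(remove_newlines(data)). Note that strip() only removes whitespace at the two ends of a string, so it cannot remove a "\n" in the middle.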
Upvotes: 1