Reputation: 21
Python noob here. I'm trying to print the lines of an HTML file that contain a given substring. I know the string is in the file, because Ctrl+F finds it when I search the HTML. However, when I run my code, it doesn't print the matching lines. Could someone explain what I'm doing wrong?
import requests
import datetime
from BeautifulSoup import BeautifulSoup
now = datetime.datetime.now()
cmonth = now.month
cday = now.day
cyear = now.year
find = 'boxscores/201'
url = 'http://www.basketball-reference.com/boxscores/index.cgi?lid=header_dateoutput&month={0}&day=17&year={2}'.format(cmonth,cday,cyear)
response = requests.get(url)
html = response.content
print html
for line in html:
    if find in line:
        print line
Upvotes: 1
Views: 8850
Reputation: 51
As snakecharmerb said, when html is a string,
for line in html:
iterates over its individual characters, not its lines. To iterate over the lines instead, split the string first:
for line in html.split("\n"):
(html.splitlines() also works, and it handles "\r\n" line endings as well.)
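Here is a minimal sketch of that line-by-line search. It is written for Python 3 (the question uses Python 2 print statements), and it runs on a made-up sample string rather than the real page, so the helper name find_matching_lines and the sample markup are assumptions for illustration only. With a live request in Python 3 you would probably pass response.text (the decoded string) rather than response.content (bytes).

```python
def find_matching_lines(text, needle):
    """Return every line of text that contains needle."""
    # splitlines() breaks the string into lines; iterating over the
    # string directly would yield single characters instead.
    return [line for line in text.splitlines() if needle in line]


# Hypothetical sample standing in for the downloaded page.
sample_html = (
    "<html>\n"
    "<a href=\"/boxscores/201601170ATL.html\">Final</a>\n"
    "<p>No link here</p>\n"
    "<a href=\"/boxscores/201601170BOS.html\">Final</a>\n"
    "</html>"
)

matches = find_matching_lines(sample_html, "boxscores/201")
for line in matches:
    print(line)
```

Only the two anchor lines from the sample are printed, because those are the only lines containing the substring.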
Upvotes: 2
Reputation: 55630
In the requests package, response.content is a string (a byte string, which is str in Python 2), so you can test for the substring directly:
if find in html:
    # do something
By iterating over response.content with
for line in html:
you are iterating over the individual characters in the string, not lines.
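To see why the original loop prints nothing, here is a small Python 3 sketch on a made-up string (the sample markup is an assumption, not the real page). Each item the loop yields is one character long, so a 13-character substring can never be found inside it, while the same test against the whole string succeeds.

```python
# Hypothetical sample standing in for the page contents.
html = "<a href='/boxscores/201601170ATL.html'>Final</a>\n<p>other</p>"
find = "boxscores/201"

# Iterating over a str yields one-character strings.
items = list(html)
longest_item = max(len(item) for item in items)

# No single character can contain a multi-character substring.
char_hits = [item for item in items if find in item]

# Testing against the whole string works as expected.
contains = find in html

print(longest_item, char_hits, contains)
```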
Upvotes: 1