Reputation: 15
Write a program that asks the user for a URL. It should then retrieve the contents of the page at that URL and print how many
<p>
tags are on that page. Your program should just print an integer.
Here's my code:
import urllib.request
link = input('Enter URL: ')
response = urllib.request.urlopen(link)
html = response.read()
counter = 0
for '<p>' in html:
counter += 1
print(counter)
However, I got this error:
Traceback (most recent call last):
File "python", line 16
SyntaxError: can't assign to literal
What would be the better method to execute this code? Should I use the find
method instead?
Upvotes: 1
Views: 989
Reputation: 642
This code works good:
from lxml import html
import requests
page = requests.get(input('Enter URL: '))
root = html.fromstring(page.content)
print(len(root.xpath('//p')))
Upvotes: 0
Reputation: 746
Try to use BeautifulSoup
from bs4 import BeautifulSoup
import requests
link = input('Enter URL: ')
response = requests.get(link)
html = response.text
soup = BeautifulSoup(html, 'lxml')
tags = soup.findAll('p')
print(len(tags))
Upvotes: 1
Reputation: 181
First of all response.read()
returns bytes; hence you need to cast it to string:
html = str(response.read())
then, no need for for
loop, you can just use count = html.counter('<p>')
Hope it help.s
Upvotes: 1