Rivered

Reputation: 789

How to add a new HTML tag without a closing tag in Python using BS4

I am trying to create some HTML output with Python but cannot get the formatting right: I would like the closing tags of the br elements to be left out. This is my current code and the HTML it generates:

item = {}

item["PRICE"] = 'US$ 68.83'
item["PUB_DATE"] = '1974'
item["SHIPPING"] = 'US$ 14.16 Shipping'

from bs4 import BeautifulSoup

html = """
<html>
<head>
</head>
<body>
</html>
"""
# Create the soup with an explicit parser (avoids bs4's "no parser specified" warning)
soup = BeautifulSoup(html, 'html.parser')
body = soup.find("body")

#Next we need to add br elements PRICE
PRICE = soup.new_tag("br")
PRICE.string = item["PRICE"]
soup.body.append(PRICE)

#PUB_DATE
PUB_DATE = soup.new_tag("br")
PUB_DATE.string = item["PUB_DATE"]
soup.body.append(PUB_DATE)

#SHIPPING
SHIPPING = soup.new_tag("br")
SHIPPING.string = item["SHIPPING"]
soup.body.append(SHIPPING)

print(soup)

#Yields
<html>
<head>
</head>
<body>
<br>US$ 68.83</br>
<br>1974</br>
<br>US$ 14.16 Shipping</br>
</body></html>

Desired outcome:

<html>
<head>
</head>
<body>
<br>US$ 68.83
<br>1974
<br>US$ 14.16 Shipping
</body></html>

When rendered, the desired output shows no blank space between the lines, whereas the current output does. I could not find any documentation on making .new_tag() omit the closing tag. In addition, needing three lines of code to add a single br tag with its text seems rather unpythonic.

Upvotes: 0

Views: 91

Answers (1)

chitown88

Reputation: 28565

You're right, I didn't see it in the documentation either. It would be nice to have a parameter that controls whether the closing tag is written: default it to True, but allow False when wanted. You could write a simple function to do that yourself if you were so inclined.
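Such a function might look like the sketch below. The name append_br_line is hypothetical and not part of bs4, and the html.parser backend and shortened document are assumptions. Rather than suppressing the closing tag, it relies on the fact that bs4 serializes an empty br tag as <br/> and only writes </br> once the tag has contents, so the text is appended as a separate node after the tag instead of being assigned to .string.

```python
from bs4 import BeautifulSoup


def append_br_line(soup, parent, text):
    """Hypothetical helper: append an empty <br> tag, then the text after it."""
    parent.append(soup.new_tag("br"))  # stays empty, so no </br> is written
    parent.append(text + "\n")         # plain string becomes a sibling text node


item = {
    "PRICE": "US$ 68.83",
    "PUB_DATE": "1974",
    "SHIPPING": "US$ 14.16 Shipping",
}

# Shortened stand-in for the document in the question
soup = BeautifulSoup("<html><head></head><body>\n</body></html>", "html.parser")

for key in ("PRICE", "PUB_DATE", "SHIPPING"):
    append_br_line(soup, soup.body, item[key])

result = str(soup)
print(result)
```

This also reduces each addition to a single call, which addresses the three-lines-per-tag complaint.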

But without that, I think you have three options here.

  1. Use div instead of br in .new_tag(); each item then appears on its own line with no extra space.
  2. Since the task is relatively simple, bypass bs4's .new_tag() function and append your desired tag and string directly.
  3. Remove the closing tag after adding the string to the new tag.
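Option 1 might look like the following minimal sketch. The html.parser backend and the shortened document are assumptions, not taken from the question. A div is a block-level element, so a browser puts each one on its own line even though the markup keeps its closing tags.

```python
from bs4 import BeautifulSoup

item = {
    "PRICE": "US$ 68.83",
    "PUB_DATE": "1974",
    "SHIPPING": "US$ 14.16 Shipping",
}

# Shortened stand-in for the document in the question
soup = BeautifulSoup("<html><head></head><body></body></html>", "html.parser")

for key in ("PRICE", "PUB_DATE", "SHIPPING"):
    div = soup.new_tag("div")  # block-level element instead of br
    div.string = item[key]     # div may have contents, so </div> is expected
    soup.body.append(div)

result = str(soup)
print(result)
```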

Option 2:

item = {}

item["PRICE"] = 'US$ 68.83'
item["PUB_DATE"] = '1974'
item["SHIPPING"] = 'US$ 14.16 Shipping'

from bs4 import BeautifulSoup

html = """
<html>
<head>
</head>
<body>
</html>
"""
# Create the soup with an explicit parser (avoids bs4's "no parser specified" warning)
soup = BeautifulSoup(html, 'html.parser')
body = soup.find("body")



#Next we need to add br elements PRICE
soup.body.append(BeautifulSoup(f'<br>{item["PRICE"]}\n', 'html.parser'))

#PUB_DATE
soup.body.append(BeautifulSoup(f'<br>{item["PUB_DATE"]}\n', 'html.parser'))


#SHIPPING
soup.body.append(BeautifulSoup(f'<br>{item["SHIPPING"]}\n', 'html.parser'))


print(soup)

Option 3:

item = {}

item["PRICE"] = 'US$ 68.83'
item["PUB_DATE"] = '1974'
item["SHIPPING"] = 'US$ 14.16 Shipping'

from bs4 import BeautifulSoup

html = """
<html>
<head>
</head>
<body>
</html>
"""
# Create the soup with an explicit parser (avoids bs4's "no parser specified" warning)
soup = BeautifulSoup(html, 'html.parser')
body = soup.find("body")

#Next we need to add br elements PRICE
PRICE = soup.new_tag("br")
PRICE.string = item["PRICE"]
PRICE = BeautifulSoup(str(PRICE).replace('</br>', '\n'), 'html.parser')
soup.body.append(PRICE)

#PUB_DATE
PUB_DATE = soup.new_tag("br")
PUB_DATE.string = item["PUB_DATE"]
PUB_DATE = BeautifulSoup(str(PUB_DATE).replace('</br>', '\n'), 'html.parser')
soup.body.append(PUB_DATE)

#SHIPPING
SHIPPING = soup.new_tag("br")
SHIPPING.string = item["SHIPPING"]
SHIPPING = BeautifulSoup(str(SHIPPING).replace('</br>', '\n'), 'html.parser')
soup.body.append(SHIPPING)

print(soup)

Output:

<html>
<head>
</head>
<body>
<br/>US$ 68.83
<br/>1974
<br/>US$ 14.16 Shipping
</body></html>
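
Note that this output uses the self-closing form <br/>. If the plain <br> from the question is wanted, Beautiful Soup's "html5" output formatter writes void tags without the trailing slash. A sketch, assuming a bs4 version that ships that formatter, the html.parser backend, and a shortened document:

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the document in the question
soup = BeautifulSoup("<html><head></head><body>\n</body></html>", "html.parser")

for text in ("US$ 68.83", "1974", "US$ 14.16 Shipping"):
    soup.body.append(soup.new_tag("br"))  # empty void tag
    soup.body.append(text + "\n")         # text as a sibling node

default_output = str(soup)                     # void tags written as <br/>
html5_output = soup.decode(formatter="html5")  # void tags written as <br>
print(html5_output)
```

The formatter only affects serialization; the parsed tree is the same in both cases.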

Upvotes: 2
