Reezal AQ

Reputation: 117

Python BeautifulSoup: How to save HTML into a database?

I'm trying to save a product description into a MySQL database. So far, I have tried changing the column datatype to BLOB, LONGBLOB, TEXT, and LONGTEXT, but none of them works.

import bs4
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
import mysql.connector

cnx = mysql.connector.connect(user='root', password='Kradz579032!!',
                              host='127.0.0.1',
                              database='aliexpressapidb')
cursor = cnx.cursor()

add_data = ("INSERT INTO productdetails"
               "(description) "
               "VALUES (%s)")

my_url = 'https://www.aliexpress.com/item/Cheap-Necklace-Jewelry-Alloy-Men-Vintage-Personality-Pendant-Creativity-Simple-Accessories-Symbol-Necklace-Wholesale-Fashion/32879629913.html?spm=a2g01.11147086.layer-iabdzn.4.4a716140Ix00VA&scm=1007.16233.91830.0&scm_id=1007.16233.91830.0&scm-url=1007.16233.91830.0&pvid=acdbf117-c0fb-458f-b8a9-ea73bc0d174b'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")
description = page_soup.findAll("div", {"class": "ui-box-body"})
#print(description)
data_insert = description
cursor.execute(add_data, data_insert)

cnx.commit()

cursor.close()
cnx.close()

I keep getting this error:

File "/Users/reezalaq/.conda/envs/untitled3/lib/python3.6/site-packages/mysql/connector/conversion.py", line 160, in to_mysql
    return getattr(self, "_{0}_to_mysql".format(type_name))(value)
AttributeError: 'MySQLConverter' object has no attribute '_tag_to_mysql'

    "MySQL type".format(type_name))
TypeError: Python 'tag' cannot be converted to a MySQL type

Upvotes: 0

Views: 1619

Answers (3)

abdul rashid

Reputation: 750

maindiv = str(soup.find("body"))

This returns the HTML of the body tag as a string.

sql = "UPDATE `abc` SET `html` = %s WHERE `abc`.`id` = %s"
val = (maindiv,1)
mycursor.execute(sql, val)
mydb.commit()

The code above updates the record.
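A minimal, self-contained sketch of the same update, using Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The table abc and its columns mirror the snippet above; the sample markup is made up for illustration. Note that sqlite3 uses ? placeholders, whereas mysql.connector uses %s.

```python
import sqlite3

# Stand-in for str(soup.find("body")): any plain Python string works as a parameter.
maindiv = '<body><div class="ui-box-body">Sample description</div></body>'

# In-memory SQLite database, used here only as a stand-in for MySQL.
db = sqlite3.connect(":memory:")
cur = db.cursor()
cur.execute("CREATE TABLE abc (id INTEGER PRIMARY KEY, html TEXT)")
cur.execute("INSERT INTO abc (id, html) VALUES (?, ?)", (1, ""))

# Parameters are passed as a tuple; the driver handles quoting and escaping.
cur.execute("UPDATE abc SET html = ? WHERE id = ?", (maindiv, 1))
db.commit()

# Read the row back to confirm the markup was stored unchanged.
stored = cur.execute("SELECT html FROM abc WHERE id = ?", (1,)).fetchone()[0]
db.close()
```

Because the markup travels as a bound parameter rather than being pasted into the SQL text, quotes and angle brackets inside the HTML need no manual escaping.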

Upvotes: 0

Reezal AQ

Reputation: 117

Problem solved by converting the HTML data to a string, as suggested by @MicahB.

tostring = str(data)
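One caveat, sketched below under the assumption that data is the list returned by findAll: calling str() on a list produces the list's repr, including the surrounding brackets and separating commas, so it may be cleaner to convert each tag and join the results. The helper name tags_to_html is hypothetical, and plain strings stand in for BeautifulSoup tags so the sketch runs without bs4 installed.

```python
def tags_to_html(tags):
    """Return the markup of every tag-like object in `tags` as one string."""
    # str(tag) gives a BeautifulSoup tag's markup; for plain strings it is a no-op.
    return "\n".join(str(tag) for tag in tags)


# Plain strings stand in for the tags returned by findAll (an assumption for this demo).
data = ["<div>first</div>", "<div>second</div>"]

as_list_repr = str(data)       # "['<div>first</div>', '<div>second</div>']"
as_html = tags_to_html(data)   # "<div>first</div>\n<div>second</div>"
```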

Upvotes: 0

Mahendra Singh Meena

Reputation: 608

description = page_soup.findAll("div", {"class": "ui-box-body"})

^ Here findAll returns a list of matching tags, but cursor.execute() needs values it can convert to MySQL types, such as strings. So you should extract a string from the matching tags. The following is one way to do this:

description = page_soup.findAll("div", {"class": "ui-box-body"})
data_insert = description[0].get_text()
cursor.execute(add_data, (data_insert,))  # wrap the single string in a 1-tuple of parameters
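Note that get_text() returns only the text content, with the markup stripped; str(description[0]) would keep the HTML instead. Either way the result is a single string, while the parameters argument of cursor.execute() is meant to be a sequence such as a tuple, or a dict. A small sketch of a hypothetical helper, as_params, that wraps a lone value accordingly:

```python
def as_params(value):
    """Wrap a single value in a 1-tuple; pass tuples, lists, and dicts through."""
    if isinstance(value, (tuple, list, dict)):
        return value
    return (value,)


# ("text") is just a parenthesized string; the trailing comma is what makes a tuple.
single = as_params("plain description text")   # ('plain description text',)
pair = as_params(("<div>html</div>", 1))       # unchanged: ('<div>html</div>', 1)
```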

Upvotes: -1
