gython

Reputation: 875

Is there a way to loop through a list with a regular expression?

Basically, I am trying to scrape all HTML tags from a list of HTML files. When I try to do this, I get the error:

TypeError: expected string or bytes-like object.

So is there a way to iterate over a list with regex?

Here is the code I am using:

import pymssql
import re

conn = pymssql.connect(
    host='xxx',
    port=xxx,
    user='xxx',
    password='xxx',
    database='xxxx'
)
cursor = conn.cursor() 
cursor.execute("SELECT 'column' FROM 'table'")

text = cursor.fetchall()

conn.close()

raw = []  
raw.append(text)

str(raw)

x = re.sub('<[^<]+?>', '', raw)

Upvotes: 1

Views: 928

Answers (2)

zkdev

Reputation: 315

Check out the BeautifulSoup package. It's an HTML parser that you can treat much like a normal Python dictionary.
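BeautifulSoup is a third-party package, and its get_text() method returns a document's text with the tags removed. As a dependency-free sketch of the same parser-based idea, the example below uses the standard library's html.parser rather than BeautifulSoup itself. The class name TextExtractor, the helper strip_tags, and the sample strings are made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects only the text nodes of a document and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for every run of text found between tags.
        self.parts.append(data)


def strip_tags(html):
    """Return the text content of one HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


# Hypothetical input: a list of HTML strings, one per file.
pages = ["<p>Hello <b>world</b></p>", "<a href='x.html'>link</a> text"]

# Apply the function to each string; a list cannot be parsed as a whole.
texts = [strip_tags(page) for page in pages]
print(texts)
```

A parser copes with input that a regular expression can mishandle, such as a ">" inside an attribute value, at the cost of a little more code.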

Upvotes: 0

Dani Mesejo

Reputation: 61930

The error:

TypeError: expected string or bytes-like object.

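refers to the fact that raw points to a list object, while re.sub expects a string. Calling str(raw) on its own returns a new string and discards it, leaving raw unchanged. To point raw to a string, you need to do: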

raw = str(raw)  # instead of just str(raw)

But if text is indeed a string, why not just:

x = re.sub('<[^<]+?>', '', text)

For more details, see the documentation on str. The quote below is from there:

Return a str version of object. See str() for details.
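To answer the literal question of looping over a list with a regex: under the Python DB-API, cursor.fetchall() returns a list of row tuples rather than a string, so one option is to apply re.sub to each value in turn. The following is a minimal sketch under that assumption. The rows list is hypothetical sample data standing in for the query result, and strip_tags is an illustrative helper name, not part of any library.

```python
import re

# Hypothetical rows, shaped like the result of cursor.fetchall():
# a list of tuples, one tuple per row, one element per selected column.
rows = [
    ("<p>Hello <b>world</b></p>",),
    ("<div>Second <i>document</i></div>",),
    (None,),  # a NULL column value comes back as None
]

# Same pattern as in the question, compiled once for reuse.
TAG_RE = re.compile(r"<[^<]+?>")


def strip_tags(rows):
    """Return the first column of every row with HTML tags removed."""
    cleaned = []
    for row in rows:
        html = row[0]
        if html is None:
            continue  # re.sub would raise TypeError on None
        cleaned.append(TAG_RE.sub("", html))
    return cleaned


cleaned = strip_tags(rows)
print(cleaned)
```

Note that raw = str(raw) makes the TypeError go away but converts the whole list to its printed form, brackets, quotes and commas included, so processing each row separately is likely closer to the intent.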

Upvotes: 1
