Reputation: 1487
I am trying to get a bunch of tables from Wikipedia. This is my code:
from urllib import urlopen
from bs4 import BeautifulSoup
import csv
url="https://en.wikipedia.org/wiki/List_of_colors:_A%E2%80%93F"
html=urlopen(url)
soup=BeautifulSoup(html,'html.parser')
table=soup.find('table',class_='wikitable sortable')
rows=table.findAll('tr')
csvFile=open("colors.csv",'w+')
writer=csv.writer(csvFile)
try:
    for row in rows:
        csvRow=[]
        for cell in row.findAll(['td','th']):
            csvRow.append(cell.get_text().decode("utf-8"))
        try:
            writer.writerow(csvRow)
        except AttributeError:
            print "--"
            continue
except UnicodeEncodeError:
    print "=="
finally:
    csvFile.close()
I wanted to write simple code, but I got so many errors that I added some exception handlers to work around them. I am still getting only the first row, though. Any help is appreciated.
Upvotes: 0
Views: 67
Reputation: 3907
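You want to encode, not decode. In Python 2, get_text() returns a unicode object, and calling .decode("utf-8") on it makes Python implicitly encode it to ASCII first. That raises UnicodeEncodeError on the first non-ASCII character, and because your outer try wraps the whole loop, the loop ends there and the output stops early.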
from urllib import urlopen
from bs4 import BeautifulSoup
import csv
url="https://en.wikipedia.org/wiki/List_of_colors:_A%E2%80%93F"
html=urlopen(url)
soup=BeautifulSoup(html,'html.parser')
table=soup.find('table',class_='wikitable sortable')
rows=table.findAll('tr')
csvFile=open("colors.csv",'w+')
writer=csv.writer(csvFile)
for row in rows:
    csvRow=[]
    for cell in row.findAll(['td','th']):
        # get_text() returns unicode; encode it to UTF-8 bytes for the Python 2 csv writer
        csvRow.append(cell.get_text().encode("utf-8"))
        print(cell.get_text())
    # write one CSV line per table row, after all of its cells are collected
    writer.writerow(csvRow)
csvFile.close()
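The code above is Python 2. If you are on Python 3, here is a rough sketch of the same idea, not a drop-in replacement: strings are already Unicode there, so you would not call encode on each cell but let the file object do the encoding. The names write_rows and sample_rows and the file name colors_sample.csv are made up for illustration, and the sample rows are placeholders standing in for whatever the BeautifulSoup loop collects.

```python
import csv


def write_rows(rows, path):
    """Write a list of rows (lists of str) to a UTF-8 encoded CSV file."""
    # newline="" lets the csv module manage line endings itself;
    # encoding="utf-8" makes the file object encode every str on write.
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        csv.writer(csv_file).writerows(rows)


# Placeholder rows (hypothetical values, not scraped data).
sample_rows = [
    ["Name", "Hex"],
    ["Example café", "#123456"],
]

# The direction matters: str -> bytes is encode, bytes -> str is decode.
encoded = "café".encode("utf-8")
decoded = encoded.decode("utf-8")

write_rows(sample_rows, "colors_sample.csv")
```

In a real script you would build the rows from cell.get_text() inside the loop and pass them to write_rows in place of the placeholders.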
Upvotes: 1