Reputation: 43
I am having an issue with the code below.
import urllib2
import csv
from bs4 import BeautifulSoup
soup = BeautifulSoup(urllib2.urlopen('http://www.ny.com/clubs/nightclubs/index.html').read())
clubs = []
trains = ["A","C","E","1","2","3","4","5","6","7","N","Q","R","L","B","D","F"]
for club in soup.find_all("dt"):
    clubD = {}
    clubD["name"] = club.b.get_text()
    clubD["address"] = club.i.get_text()
    text = club.dd.get_text()
    nIndex = text.find("(")
    if(text[nIndex+1]=="2"):
        clubD["number"] = text[nIndex:nIndex+15]
    sIndex = text.find("Subway")
    sIndexEnd = text.find(".",sIndex)
    if(text[sIndexEnd-1] == "W" or text[sIndexEnd -1] == "E"):
        sIndexEnd2 = text.find(".",sIndexEnd+1)
        clubD["Subway"] = text[sIndex:sIndexEnd2]
    else:
        clubD["Subway"] = text[sIndex:sIndexEnd]
    try:
        cool = clubD["number"]
    except (ValueError,KeyError):
        clubD["number"] = "N/A"
    clubs.append(clubD)
keys = [u"name", u"address",u"number",u"Subway"]
f = open('club.csv', 'wb')
dict_writer = csv.DictWriter(f, keys)
dict_writer.writerow([unicode(s).encode("utf-8") for s in clubs])
I get the error ValueError: dict contains fields not in fieldnames. I don't understand how this could be. Any assistance would be great. I am trying to turn the dictionary into an Excel file.
Upvotes: 2
Views: 294
Reputation: 40723
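clubs is a list of dictionaries, and each dictionary has four fields: name, address, number, and Subway. writerow expects a single dictionary, but your code passes it one list built from the whole of clubs, so DictWriter treats each list element as a field name and raises the error. You will need to write one row per dictionary and encode each of the fields: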
# Instead of:
#dict_writer.writerow([unicode(s).encode("utf-8") for s in clubs])
# Do this:
for c in clubs:
    # Encode each field: name, address, ...
    for k in c.keys():
        c[k] = c[k].encode('utf-8').strip()
    # Write to file
    dict_writer.writerow(c)
I looked at your data and some of the fields end with a newline (\n), so I updated the code to encode and strip whitespace at the same time.
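For anyone reading this on Python 3, here is a minimal sketch of the same idea: one writerow call per dictionary, with whitespace stripped from every field. It is an illustration, not the code from the question. The sample rows and the write_clubs helper are made up for the example, and it writes to an in-memory buffer rather than club.csv. In Python 3 the csv module works with text, so the manual encode('utf-8') step is not needed.

```python
import csv
import io

# Hypothetical sample rows standing in for the scraped `clubs` list.
clubs = [
    {"name": "Club A\n", "address": "1 Main St",
     "number": "(212) 555-0100", "Subway": "Subway: A, C, E"},
    {"name": "Club B", "address": "2 Side St\n",
     "number": "N/A", "Subway": "Subway: L"},
]
keys = ["name", "address", "number", "Subway"]


def write_clubs(rows, fieldnames, out):
    """Write a header, then one CSV row per dictionary in `rows`.

    Hypothetical helper: each value is stripped of leading and
    trailing whitespace before it is written.
    """
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        # Pass a single dict per call, never the whole list.
        writer.writerow({k: row[k].strip() for k in fieldnames})


# newline="" leaves line endings to the csv module.
buf = io.StringIO(newline="")
write_clubs(clubs, keys, buf)
csv_text = buf.getvalue()
print(csv_text)
```

To write a real file instead, something like open('club.csv', 'w', newline='', encoding='utf-8') in a with block could replace the buffer. The writeheader call is optional; it is included so the columns are labeled when the file is opened in a spreadsheet.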
Upvotes: 2