Reputation: 167
The code I am using:
import urllib2
import csv
from bs4 import BeautifulSoup

url = "http://en.wikipedia.org/wiki/List_of_ongoing_armed_conflicts"
soup = BeautifulSoup(urllib2.urlopen(url))
fl = open('locations.csv', 'w')

def unique(countries):
    seen = set()
    for country in countries:
        l = country.lower()
        if l in seen:
            continue
        seen.add(l)
        yield country

locs = []
for row in soup.select('table.wikitable tr'):
    cells = row.find_all('td')
    if cells:
        for location in cells[3].find_all(text=True):
            locs.extend(location.split())

locs2 = []
for locations in unique(locs):
    locations = locs2.extend(locations.split())
print sorted(locs2)

writer = csv.writer(fl)
writer.writerow(['location'])
for values in sorted(locs2):
    writer.writerow(values)
fl.close()
When I print the list I am writing, I get a u' in front of each element, which I think is why it is outputting this way. I tried using .strip(u''), but it gives me an error that .strip cannot be used because it is a list.
What am I doing wrong?
Upvotes: 0
Views: 3075
Reputation: 1121724
locs2 is a list of strings, not a list of lists. As such, you are trying to write individual strings as rows:
for values in sorted(locs2):
    writer.writerow(values)
Here values is a string, and writerow() treats it as a sequence. Each element of whatever sequence you pass to that function is written as a separate column, so every character of the string ends up in its own column.
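To make the difference concrete, here is a minimal sketch comparing a bare string with a one-element list. It is written for Python 3 (the question uses Python 2), writes to an in-memory buffer instead of a file, and rows_written is a hypothetical helper name introduced only for this illustration.

```python
import csv
import io


def rows_written(row):
    """Return the CSV text produced by passing `row` to writerow().

    Hypothetical helper for illustration; not part of the csv module.
    """
    buf = io.StringIO()  # in-memory stand-in for a real file
    csv.writer(buf).writerow(row)
    return buf.getvalue()


# A bare string is iterated character by character: one column per character.
as_string = rows_written("Iraq")

# A one-element list yields a single column holding the whole string.
as_list = rows_written(["Iraq"])

print(repr(as_string))
print(repr(as_list))
```

The first call should produce the comma-separated characters, the second the intact word.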
If you wanted to write all locations as one row, pass the whole list to writer.writerow():
writer.writerow(sorted(locs2))
If you wanted to write a new row for each individual location, wrap it in a list first:
for location in sorted(locs2):
    writer.writerow([location])
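The one-row-per-location variant can also be written with writer.writerows(), which accepts an iterable of rows. The following is only a sketch under assumptions: it targets Python 3, uses an in-memory buffer in place of locations.csv, and the locs2 values are made-up sample data standing in for the scraped list.

```python
import csv
import io

# Hypothetical sample data standing in for the scraped locs2 list.
locs2 = ["Syria", "Iraq", "Mali"]

buf = io.StringIO()  # in-memory stand-in for the locations.csv file
writer = csv.writer(buf)
writer.writerow(["location"])  # header row, as in the question

# Wrap each location in its own one-element list: one row per location.
writer.writerows([location] for location in sorted(locs2))

output = buf.getvalue()
print(output)
```

When writing to a real file in Python 3, the csv documentation recommends opening it with newline='' so the writer controls the line endings.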
You don't need to strip the u prefixes from the strings; that's just Python telling you that you have Unicode string objects, not byte string objects:
>>> 'ASCII byte string'
'ASCII byte string'
>>> 'ASCII unicode string'.decode('ascii')
u'ASCII unicode string'
See the following information if you want to learn more about Python and Unicode:
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky
Pragmatic Unicode by Ned Batchelder
Upvotes: 1