Reputation: 1
I am scraping some content from this site. When I write a field such as the conference head to a CSV file after extracting it from the site, the first name does not come out properly. For example, if the word is microsoft, it comes out as osoft, but all the rest of the words come out properly.
Here is my code:
import csv
import requests
from bs4 import BeautifulSoup

with open('random.csv', 'w') as csvfile:
    a = csv.writer(csvfile)
    a.writerow(["conferenceHead"])
    url = given above
    r = requests.get(url)
    soup = BeautifulSoup(r.content)
    links = soup.find_all("div")
    r_data = soup.find_all("div",{"class":"conferenceHead"})
    for item in r_data:
        conferenceHead = item.contents[1].text
        with open('random.csv','a') as csvfile:
            a = csv.writer(csvfile)
            data = [conferenceHead]
            a.writerow(data)
Upvotes: 0
Views: 39
Reputation: 710
Well, you have three issues in your code. One of them is having two with open() statements on the same file. This might cause the buffer not to be written to the file properly, truncating the string you are saving.
After fixing these errors (removing with open('random.csv','a') as csvfile and fixing the indentation), the code runs and the output is not trimmed.
import csv
import requests
from bs4 import BeautifulSoup

with open('random.csv', 'w') as csvfile:
    a = csv.writer(csvfile)
    a.writerow(["conferenceHead"])
    url = "http://www.allconferences.com/search/index"\
          "/Category__parent_id:1/Venue__country:United%20States"\
          "/Conference__start_date__from:01-01-2010/sort:start_date"\
          "/direction:asc/showLastConference:1/page:7/"
    r = requests.get(url)
    soup = BeautifulSoup(r.content)
    links = soup.find_all("div")
    r_data = soup.find_all("div",{"class":"conferenceHead"})
    for item in r_data:
        conferenceHead = item.contents[1].text
        data = [conferenceHead]
        # Reuse the writer opened above instead of reopening the file.
        a.writerow(data)
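The truncation can probably be reproduced without any scraping. The sketch below is only an illustration under assumptions: the function names, the sample rows, and the temporary file paths are all made up for the demo, and it uses Python 3's newline="" convention for CSV files. The likely mechanism is that the header stays in the buffer of the 'w' handle until that handle closes, and is then written at offset 0, over the start of the first row that the 'a' handle already flushed.

```python
import csv
import os
import tempfile


def write_with_two_handles(path, rows):
    """Hypothetical repro: an 'a' handle writes while the 'w' handle is open."""
    with open(path, "w", newline="") as outer:
        # The header is small, so it normally stays in outer's buffer.
        csv.writer(outer).writerow(["conferenceHead"])
        for row in rows:
            with open(path, "a", newline="") as inner:
                # Each row reaches the file as soon as inner closes.
                csv.writer(inner).writerow([row])
    # outer closes last and only now flushes the header, starting at offset 0.


def write_with_one_handle(path, rows):
    """Same output, but every row goes through the single open writer."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["conferenceHead"])
        for row in rows:
            writer.writerow([row])


def read_rows(path):
    """Read the CSV file back as a list of rows."""
    with open(path, newline="") as f:
        return list(csv.reader(f))


# Sample data invented for the demo.
sample = ["Microsoft Conference 2010", "Second Conference"]

with tempfile.TemporaryDirectory() as tmp:
    broken_path = os.path.join(tmp, "two_handles.csv")
    fixed_path = os.path.join(tmp, "one_handle.csv")
    write_with_two_handles(broken_path, sample)
    write_with_one_handle(fixed_path, sample)
    print("two handles:", read_rows(broken_path))
    print("one handle: ", read_rows(fixed_path))
```

With two handles, the first data row should come back with its leading characters missing while later rows are intact, which matches the symptom in the question. The exact number of lost characters depends on the header length and on platform buffering.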
Upvotes: 1