Reputation: 329
So, I'm doing some web scraping with Beautiful Soup.
Let's say I want to store the data to a CSV file on every pass of the loop below:
containers = page_soup.findAll("div", {"class": "product-img"})
filename = "result.csv"
f = open(filename, "w")
headers = "Link,Image\n"
f.write(headers)
for container in containers:
    items = container.findAll("a")
    for item in items:
        datalink = item.attrs['href']
        dataimg = item.attrs['src']
        f.write(datalink + "," + dataimg + "\n")
f.close()
When I open the CSV file with Excel, the data is crammed into 1 column instead of 2 columns.
What I got:
Column A
Link,Image
link1,img1
link2,img2
link3,img3
link4,img4
link5,img5
What I expected:
Column A   Column B
Link       Image
link1      img1
link2      img2
link3      img3
link4      img4
link5      img5
Upvotes: 0
Views: 101
Reputation: 3716
Short answer: use a library. In this case, Python's built-in csv
module:
Longer answer: without the exact code and data input you're using, the best I can do is guess at the problem. But, in short, due to edge cases like embedded quotes and commas, CSV is slightly more complicated than you think. Use a library that has already sussed out the details.
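For instance, a field that happens to contain a comma or a quote has to be quoted and escaped, and the csv module does that for you. A quick throwaway illustration (made-up values, written to stdout just to show the escaping):
import csv
import sys

# QUOTE_MINIMAL only quotes fields that actually need it, and doubles
# any embedded quote characters, so commas and quotes round-trip safely.
writer = csv.writer(sys.stdout, quoting=csv.QUOTE_MINIMAL)
writer.writerow(('Shirt, Blue ("XL")', 'https://example.com/img.jpg'))
# Prints: "Shirt, Blue (""XL"")",https://example.com/img.jpg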
Additionally, don't "C-think" the file manipulation. Use with. That is:
# "Bad"
f = open('somefile', 'w')
f.write( data )
f.close()
# Good
with open('somefile', 'w') as f:
f.write( data )
I'll leave it to the docs (section 7.2) to explain why with is the (much) better path.
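The short version, if you don't feel like chasing the link: the with form closes the file for you even when an exception is raised part-way through, roughly equivalent to this try/finally boilerplate:
# Roughly what the "Good" version buys you: the file gets closed
# even if the write blows up half-way through.
f = open('somefile', 'w')
try:
    f.write(data)
finally:
    f.close()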
Finally, example code to get you on your way:
import csv

containers = page_soup.findAll("div", {"class": "product-img"})

filename = "result.csv"
# newline='' (per the csv docs) stops the writer from adding extra
# blank rows when Excel opens the file on Windows
with open(filename, "w", newline="") as f:
    csvwriter = csv.writer(f, quoting=csv.QUOTE_MINIMAL)
    csvwriter.writerow(('Link', 'Image'))
    for container in containers:
        items = container.findAll("a")
        for item in items:
            datalink = item.attrs['href']
            dataimg = item.attrs['src']
            csvwriter.writerow((datalink, dataimg))
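And if you want a quick sanity check that every row really has two fields (regardless of how Excel decides to display them), read the file back with csv.reader. A minimal sketch, assuming the result.csv produced above:
import csv

# Re-parse the output and confirm each row splits into two fields.
with open("result.csv", newline="") as f:
    for row in csv.reader(f):
        print(len(row), row)   # expect 2 for every row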
Upvotes: 1