I am trying to scrape the text details of each store location and write them to a CSV file using BeautifulSoup. The 2 stores in Alabama are inside one div with class LocationSecContent, and the 17 stores in Arizona are inside another div with the same class. In Georgia, the first store (Airport) is in its own div with class location inside LocationSecContent, and the remaining 4 Georgia stores are in another div with class location inside LocationSecContent. I would like to scrape the store details (name, location, street, phone, fax, hours content and so on) and write them to a CSV file. I'm inspecting the page with Firebug in Firefox. Sorry if there are any mistakes; I'm a beginner with BeautifulSoup.
Here is what I have tried:
from bs4 import BeautifulSoup
import requests
page = requests.get('http://freshvites.com/store-locator/')
soup = BeautifulSoup(page.text, 'html.parser')
d={}
for table in soup.find_all("div", {"class":"content freshvites-location"}):
    table
    for col in table.find_all("td"):
        LocationSecHdr=col.find_all("div",{'class':'LocationSecHdr'})
        Location=col.find_all("div",{'class':'location'})
        dt="LocationSecHdr:%s,Location: %s" %(LocationSecHdr, Location)
        zx=BeautifulSoup(dt, "html.parser")
        print zx.get_text()
I'm not able to iterate through rows and scrape the text.
Method 2:
from bs4 import BeautifulSoup
import requests
page = requests.get('http://freshvites.com/store-locator/')
#print page
soup = BeautifulSoup(page.text, 'html.parser')
#print soup.find_all('a')
for table in soup.find_all("div",{'class':'content freshvites-location'}):
    table
    LocationSecHdr=''
    LocationSecContent=''
    Location=''
    LocationTitle=''
    Phone=''
    Fax=''
    HoursTitle=''
    HoursContent=''
    for col in table.find_all("td"):
        LocationSecHdr=col.find_all("div",{'class':'LocationSecHdr'})
        #LocationSecContent= col.find_all("div",{'class':'LocactionSecContent'})
        #Location= col.find_all("div",{'class':'location'})
        LocationTitle= col.find_all("div",{'class':'locationTitle'})
        Phone= col.find_all("div",{'class':'Phone'})
        Fax= col.find_all("div",{'class':'Fax'})
        HoursContent=col.find_all("div",{'class':'HoursContent'})
        data="LocationSecHdr: %s, LocationSecContent: %s, Location:%s, LocationTitle : %s, Phone:%s, Fax :%s, HoursContent:%s " %(LocationSecHdr, LocationSecContent, Location, LocationTitle, Phone, Fax, HoursContent)
        zax=BeautifulSoup(data,"html.parser")
        print zax.get_text()
With this code I can't get the address of the store, and I don't know how to get these details as a dict either.
I think I have enough information now to answer your question.
You are looking for the wrong tag/class combination. All the information for a location is contained inside a <div class="location">. Here is a sample:
<div class="location">
    <div class="locationTitle">32nd Street & Thunderbird</div>
    Fresh Vitamins<br>
    13802 N. 32nd St #11<br>
    Phoenix, AZ 85032<br>
    <div class="Phone"> </div>
    <div class="Fax">877.935.6902</div>
    <div class="HoursTitle">Hours:</div>
    <div class="HoursContent">9am - 7pm M-F<br> 9am - 6pm Sat<br> 11am - 4pm Sun</div>
</div>
As you can see in the sample, there is no <tr> or <td>, so looking for those doesn't really make sense.
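If you want to check this for yourself, here is a minimal sketch that runs both lookups against a trimmed copy of the sample. The SAMPLE string is my own stand-in for the page, not the live site, so treat it as an illustration only:

```python
from bs4 import BeautifulSoup

# Trimmed copy of the sample markup (an assumption: the live page may differ).
SAMPLE = """
<div class="location">
<div class="locationTitle">32nd Street & Thunderbird</div>
Fresh Vitamins<br>
13802 N. 32nd St #11<br>
Phoenix, AZ 85032<br>
<div class="Fax">877.935.6902</div>
</div>
"""

soup = BeautifulSoup(SAMPLE, "html.parser")

# There are no table cells in this markup, so this list comes back empty.
cells = soup.find_all("td")

# class_ matches whole class names, so "locationTitle" is not picked up here.
locations = soup.find_all("div", class_="location")

print(len(cells), len(locations))
```

On this sample the first lookup finds nothing and the second finds the one store block, which is why the loops over td in the question never produce any text.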
Here's a short python script I put together to find all locations:
from bs4 import BeautifulSoup
import requests
page = requests.get('http://freshvites.com/store-locator/')
soup = BeautifulSoup(page.content, 'html.parser')
for div in soup.find_all("div", {"class":"location"}):
    print(div)
Now you just need to extract the fields you want from each div. Everything you need for that should be easy to find on Stack Overflow.
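As a rough starting point, the extraction could look something like the sketch below. It turns each location div into a dict and writes the dicts out as CSV. Several things here are my assumptions rather than facts about the site: the helper names (text_of, parse_location), the column names, and the idea that the state name sits in a LocationSecHdr div somewhere before the store blocks. It runs on an inline sample instead of the live page.

```python
import csv
import io

from bs4 import BeautifulSoup, NavigableString

# Inline stand-in for the page. The LocationSecHdr/LocationSecContent
# wrapper layout is an assumption based on the class names in the question.
SAMPLE = """
<div class="LocationSecHdr">Arizona</div>
<div class="LocationSecContent">
<div class="location">
<div class="locationTitle">32nd Street & Thunderbird</div>
Fresh Vitamins<br>
13802 N. 32nd St #11<br>
Phoenix, AZ 85032<br>
<div class="Phone"> </div>
<div class="Fax">877.935.6902</div>
<div class="HoursTitle">Hours:</div>
<div class="HoursContent">9am - 7pm M-F<br> 9am - 6pm Sat<br> 11am - 4pm Sun</div>
</div>
</div>
"""

FIELDS = ["state", "title", "address", "phone", "fax", "hours"]  # hypothetical column names


def text_of(div, css_class):
    """Text of the first nested div with this class, or '' if it is missing."""
    node = div.find("div", class_=css_class)
    return node.get_text("; ", strip=True) if node else ""


def parse_location(div):
    """Build one dict per store from a <div class="location"> element."""
    # The address lines are bare strings directly inside the location div,
    # so keep only direct string children and skip the nested divs.
    address = [child.strip() for child in div.children
               if isinstance(child, NavigableString) and child.strip()]
    # Assumption: the nearest preceding LocationSecHdr holds the state name.
    header = div.find_previous("div", class_="LocationSecHdr")
    return {
        "state": header.get_text(strip=True) if header else "",
        "title": text_of(div, "locationTitle"),
        "address": ", ".join(address),
        "phone": text_of(div, "Phone"),
        "fax": text_of(div, "Fax"),
        "hours": text_of(div, "HoursContent"),
    }


soup = BeautifulSoup(SAMPLE, "html.parser")
rows = [parse_location(div) for div in soup.find_all("div", class_="location")]

# Written to an in-memory buffer here; for a real file, pass
# open("stores.csv", "w", newline="") instead of the StringIO object.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

To run it against the real page, replace SAMPLE with page.content from the requests call above. If the state header turns out to live somewhere else in the markup, only the find_previous line needs to change.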