Reputation: 20302
I think this block of code is pretty close to being right, but something is throwing it off. I'm trying to loop through 10 URLs and download the contents of each to a text file, and make sure everything is structured orderly, in a dataframe
.
import pandas as pd
rawHtml = ''
url = r'http://www.pga.com/golf-courses/search?page=" + i + "&searchbox=Course%20Name&searchbox_zip=ZIP&distance=50&price_range=0&course_type=both&has_events=0'
g = open("C:/Users/rshuell001/Desktop/MyData.txt", "w")
for i in range(0, 10):
df = pd.DataFrame.from_csv(url)
print(df)
g.write(str(df))
g.close()
The error that I get says:
CParserError: Error tokenizing data.
C error: Expected 1 fields in line 22, saw 2
I have no idea what that means. I only have 9 lines of code, so I don't know why it's mentioning a problem on line 22.
Can someone give me a push to get this working?
Upvotes: 0
Views: 751
Reputation: 20302
I finally got it working. This is what I was trying to do all along.
import requests
from bs4 import BeautifulSoup
link = "http://www.pga.com/golf-courses/search?page=1&searchbox=Course%20Name&searchbox_zip=ZIP&distance=50&price_range=0&course_type=both&has_events=0"
html = requests.get(link).text
soup = BeautifulSoup(html, "lxml")
res = soup.findAll("div", {"class": "views-field-nothing"})
for r in res:
print("Address: " + r.find("span", {'class': 'field-content'}).text)
Upvotes: 1
Reputation: 17679
pandas.DataFrame.from_csv()
takes a first argument which is either a path or a file-like handle, where either are supposed to be pointing at valid CSV file.
You are providing it with a URL.
It seems that you want to use a different function: the top-level pandas.read_csv
. This function will actually fetch the data from you from a valid URL, then parse it.
If for any reason you insist on using pandas.DataFrame.from_csv()
, you will have to:
pandas.DataFrame.from_csv()
.Upvotes: 1