Reputation: 29
Sorry to ask this! I'm a newbie, so feel free to teach me anything you know. I'm making a scraping tool for marketing purposes that scrapes contact information from websites. I'm using Python 3. This is my code:
import requests, bs4, os, codecs, csv
import pandas as pd
import sys
os.path.join('usr', 'bin', 'spam')
openFile = open('C:\\Users\\hdtra\\Desktop\\Test_1.csv',encoding='utf-8-sig')
read_test = csv.reader(openFile)
for link in read_test :
    res = requests.get(link)
    res.raise_for_status
    facebookSpider = bs4.BeautifulSoup(res.text)
    email = facebookSpider.select("._4-u2._3xaf._3-95._4-u8")
    helloFile = open('C:\\Users\\hdtra\\Desktop\\In processing\\information.txt','w')
    helloFile.write(str(email[3].encode('utf-8')) + '\n')
    helloFile.close()
I have no idea why it gives me something like this:
Traceback (most recent call last):
  File "C:\Users\hdtra\Desktop\In processing\Facebook_spider.py", line 12, in <module>
    res = requests.get(link)
  File "C:\Program Files\Python36\lib\site-packages\requests\api.py", line 72, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Program Files\Python36\lib\site-packages\requests\api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Program Files\Python36\lib\site-packages\requests\sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Program Files\Python36\lib\site-packages\requests\sessions.py", line 612, in send
    adapter = self.get_adapter(url=request.url)
  File "C:\Program Files\Python36\lib\site-packages\requests\sessions.py", line 703, in get_adapter
    raise InvalidSchema("No connection adapters were found for '%s'" % url)
requests.exceptions.InvalidSchema: No connection adapters were found for '['http://www.facebook.com/D2Streetwear/?ref=br_rs']'
I know that get() only accepts a string, but I have no idea how to convert these links into strings. This is my CSV file, which has only one column with 5 rows:
http://www.facebook.com/D2Streetwear/?ref=br_rs
https://www.facebook.com/RealClothes/?ref=br_rs
https://www.facebook.com/Lecamelliaclothing/?ref=br_rs
https://www.facebook.com/TaTclothing-285844471884952/?ref=br_rs
https://www.facebook.com/Dai-Clothing-130675847640538/?ref=br_rs
I tried str(link()), but it does not work.
Upvotes: 1
Views: 3680
Reputation: 402783
You should understand that csv.reader returns an iterator, and each iteration yields one row as a list of column values. From the documentation:
csv.reader(csvfile, dialect='excel', **fmtparams)
Return a reader object which will iterate over lines in the given csvfile. [...]
Each row read from the csv file is returned as a list of strings.
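The last sentence is the important part: each row is a list, not a string, which is why requests.get(link) received ['http://...'] instead of a URL. Your CSV appears to contain a single column, so you can access its value using link[0].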
import csv
import requests

with open('test.csv') as f:
    reader = csv.reader(f)
    for row in reader:
        # row is a list of columns; row[0] is the URL string
        res = requests.get(row[0])
        ...
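To see the list-per-row behaviour without touching the network, here is a small sketch. It uses io.StringIO as a stand-in for the real file (an assumption made so the example is self-contained) and two of the URLs from the question:

```python
import csv
import io

# Stand-in for the CSV file: one URL per line, a single column (assumed layout).
sample = io.StringIO(
    "http://www.facebook.com/D2Streetwear/?ref=br_rs\n"
    "https://www.facebook.com/RealClothes/?ref=br_rs\n"
)

rows = list(csv.reader(sample))

# Each row is a list of column values, even when there is only one column.
first_row = rows[0]
first_url = first_row[0]

print(type(first_row).__name__, first_row)
print(type(first_url).__name__, first_url)
```

Passing first_row to requests.get() is what produced the InvalidSchema error; first_url is the plain string it expects.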
I consider it good practice to always use a with...as context manager when handling file I/O, as it automatically closes your file and results in cleaner code.
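As a rough sketch of that advice applied to both the input and the output file: read_urls is a hypothetical helper, and the temporary directory and file names are made up for the demo. It only reads and writes local files; the requests call would go where the URLs are written out.

```python
import csv
import os
import tempfile


def read_urls(csv_path):
    """Return the first column of every non-empty row (hypothetical helper)."""
    # 'utf-8-sig' drops a leading BOM if present; newline='' is what the csv docs recommend.
    with open(csv_path, newline='', encoding='utf-8-sig') as f:
        return [row[0].strip() for row in csv.reader(f) if row and row[0].strip()]


# Build a throwaway CSV so the sketch runs anywhere (paths are assumptions).
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, 'links.csv')
with open(csv_path, 'w', newline='', encoding='utf-8') as f:
    f.write("http://www.facebook.com/D2Streetwear/?ref=br_rs\n")
    f.write("\n")  # a blank line comes back from csv.reader as an empty list
    f.write("https://www.facebook.com/RealClothes/?ref=br_rs\n")

urls = read_urls(csv_path)

# Open the output file once, outside the loop, so earlier lines are not overwritten.
out_path = os.path.join(tmp_dir, 'information.txt')
with open(out_path, 'w', encoding='utf-8') as out:
    for url in urls:
        out.write(url + '\n')
```

Both files are closed as soon as their with blocks end, even if an exception is raised inside them.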
Upvotes: 1