Reputation: 101
I'm scraping search results from Bing. Everything works except the output to the CSV file. I've also tried pandas but can't get the output right. I need the "url" in column A and the "name" in column B, next to the corresponding link.
def scrape():
    urls = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "b_algo")))
    url = [div.find_element_by_tag_name('a').get_attribute('href') for div in urls]
    names = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "b_algo")))
    name = [div.find_element_by_tag_name('h2 > a').get_attribute('innerHTML').split('-')[0].strip() for div in names]
    x1 = [url]
    x2 = [name]
    pp.pprint([url,name])
    with open(bing_parameters.file_name, 'a', newline='\n', encoding='utf-8') as f:
        wr = csv.writer(f)
        for items in x1:
            wr.writerow([x1,x2])

scrape()
Upvotes: 0
Views: 49
Reputation: 55894
Let's say you have this data:
x1 = ['foo']
x2 = ['https://www.example.com']
Then your existing code is doing something like this:
for items in x1:
    print([x1, x2])
Giving this incorrect output:
[['foo'], ['https://www.example.com']]
The code is looping over the contents of x1 (a list containing one item, so the loop will have one iteration) and outputting a list containing x1 and x2, which are both lists.
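The same thing happens when the nested lists go through csv.writer: each inner list is converted with str() and lands in a single cell. A minimal sketch, assuming the sample data above and using an in-memory io.StringIO buffer as a stand-in for the real file:

```python
import csv
import io

x1 = ['foo']
x2 = ['https://www.example.com']

buffer = io.StringIO()  # stand-in for the real file, for illustration only
wr = csv.writer(buffer)
for items in x1:
    # Writes the two lists themselves, not the values inside them
    wr.writerow([x1, x2])

nested_output = buffer.getvalue()
print(repr(nested_output))
```

Each cell holds the text form of a whole list, brackets and quotes included, rather than a plain value.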
If x1 and x2 are always single-item lists, you can explicitly select the first item in each and dispense with the loop:
with open(bing_parameters.file_name, 'a', newline='\n', encoding='utf-8') as f:
    wr = csv.writer(f)
    wr.writerow([x1[0], x2[0]])
Or skip the redundant lists and write the values directly (this assumes name and url each hold a single value):
with open(bing_parameters.file_name, 'a', newline='\n', encoding='utf-8') as f:
    wr = csv.writer(f)
    wr.writerow([name, url])
If x1 and x2 contain multiple corresponding items, you can zip them together:
x1 = [name1, name2]
x2 = [url1, url2]
with open(bing_parameters.file_name, 'a', newline='\n', encoding='utf-8') as f:
    wr = csv.writer(f)
    for name, url in zip(x1, x2):
        wr.writerow([name, url])
Or, more concisely, pass the zipped pairs straight to writerows:
x1 = [name1, name2]
x2 = [url1, url2]
with open(bing_parameters.file_name, 'a', newline='\n', encoding='utf-8') as f:
    wr = csv.writer(f)
    wr.writerows(zip(x1, x2))
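Applied to the layout asked for in the question (URL in column A, name in column B), the zip approach might look like the sketch below. The sample values and the temporary file path are assumptions for illustration; in the real script the two lists would come from the scraper and the path from bing_parameters.file_name.

```python
import csv
import os
import tempfile

# Hypothetical sample data standing in for the scraped results
urls = ['https://www.example.com/a', 'https://www.example.com/b']
names = ['First result', 'Second result']

# Temporary path used here in place of bing_parameters.file_name
csv_path = os.path.join(tempfile.mkdtemp(), 'results.csv')

with open(csv_path, 'a', newline='', encoding='utf-8') as f:
    wr = csv.writer(f)
    # zip pairs each URL with its name; each pair becomes one row
    wr.writerows(zip(urls, names))

# Read the file back to check the layout: column A = URL, column B = name
with open(csv_path, newline='', encoding='utf-8') as f:
    rows = list(csv.reader(f))
print(rows)
```

The sketch opens the file with newline='', which is what the csv module documentation recommends for files handed to csv.writer.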
Upvotes: 0
Reputation: 9969
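Try this. It puts url in the first column (A) and name in the second column (B), then writes the result to the CSV file.
import pandas as pd

# One row per URL; the single column is labelled 'A'
df = pd.DataFrame(url)
df.columns = ['A']
# Add the names as a second column, aligned row by row with the URLs
df['B'] = name
print(df)
# Note: to_csv overwrites an existing file by default (mode='w')
df.to_csv(bing_parameters.file_name, index=False)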
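Since the question opens the file in append mode, a variant that appends rather than overwrites may be closer to what is needed. This is only a sketch: the sample lists and the temporary path are assumptions standing in for the scraped data and bing_parameters.file_name.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the scraped lists
url = ['https://www.example.com/a', 'https://www.example.com/b']
name = ['First result', 'Second result']

# Build both columns at once; the dict order sets the column order
df = pd.DataFrame({'A': url, 'B': name})

# Temporary path used here in place of bing_parameters.file_name
csv_path = os.path.join(tempfile.mkdtemp(), 'results.csv')

# Append twice to mimic two scrape runs; the header row is written
# only when the file does not exist yet
for _ in range(2):
    df.to_csv(csv_path, mode='a', index=False,
              header=not os.path.exists(csv_path))

result = pd.read_csv(csv_path)
print(result)
```

Checking for the file before each write keeps the header from being repeated in the middle of the data on later runs.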
Upvotes: 1