Reputation: 2373
I have been trying to download data from a website and then save it to a csv file. The problem is: I cannot save it in a proper way to read it or to import into a database.
Here is my code:
import csv
import requests
from bs4 import BeautifulSoup
def getData(url_to_scrap='https://www.investing.com/currencies/eur-usd-historical-data', file=None, save_file="Name.csv"):
if url_to_scrap is not None:
header = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
r = requests.get(url_to_scrap, headers=header)
data = BeautifulSoup(r.content, 'html.parser')
else:
data = BeautifulSoup(open(file, encoding='utf8'), 'html.parser')
table = data.find(id='curr_table')
table = table.find_all('td')
row_data = []
for row in table:
row_data.append(row.get_text('data-real-value'))
with open(save_file, 'w') as save:
for row in row_data:
writer = csv.writer(save, delimiter=';')
writer.writerow(row)
getData(save_file="EUR USD Historical Data.csv")
And the output on the CSV file:
M;a;y; ;3;1;,; ;2;0;1;7
1;.;1;2;1;8
1;.;1;1;7;2
1;.;1;2;2;0
....
What I need:
May 31, 2017;1.1218;1.1172;1.1220;1.1165;0.30%
If you check the website, everything is in a table and I need it similar in csv. What should I change to make it work?
Upvotes: 0
Views: 1098
Reputation: 54267
Make sub lists of length six each (= 1 row) from your row_data
list (via):
with open(save_file, 'wb') as save:
writer = csv.writer(save, delimiter=';')
for row in [row_data[x:x+6] for x in xrange(0, len(row_data), 6)]:
writer.writerow(row)
#May 31, 2017;1.1242;1.1172;1.1252;1.1165;0.51%
#May 30, 2017;1.1185;1.1163;1.1207;1.1108;0.18%
#May 29, 2017;1.1165;1.1177;1.1192;1.1159;-0.16%
#...
(As noted here, xrange
may need to be range
if you are using Python 3; I'm on Python 2 and a beginner, so...)
Upvotes: 1
Reputation: 549
EDIT
When you are appending the strings to the row list, append it as a list
row_data.append([row.get_text('data-real-value')])
so that you have a list of lists of strings.
see Python csv library leaves empty rows even when using a valid lineterminator for a better example.
Keep the writerows
as below to print a single item per line.
with open(save_file, 'w') as save:
writer = csv.writer(save, delimiter=';')
writer.writerows(row)
since it will place the delimiter between each element in the list
Upvotes: 0