vivace

Reputation: 57

Convert a list into a DataFrame (pandas) in Python

I managed to parse the URL into a list, but when I call pd.DataFrame() on it, all the data seems to reset. Can you please help me see where I went wrong?

This is the code I used to scrape the page:

import requests
import pandas as pd
from bs4 import BeautifulSoup as bs

# currency table URL
URL = "https://www.xe.com/currencytables/?from=USD&date=2019-05-01"
data = requests.get(URL).text

# parse the HTML
soup = bs(data, "html.parser")

# find the tables you want (slicing keeps this a list of tags)
table = soup.find_all("table")[0:1]

# read it into pandas
FXrate = pd.read_html(str(table))
FXrate

This works.

The problem occurs when I run:

FXrate = pd.DataFrame(FXrate)
FXrate

As far as I know, I just converted the list into a DataFrame, but somehow the table doesn't display properly.

Upvotes: 2

Views: 57

Answers (2)

prosti

Reputation: 46469

Just one side note: read_html works with <table>, <tr>, and <td> tags, which is how it recognizes tables. This one will work:

<table>
<tbody>
<tr>
<td>&nbsp;</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>&nbsp;</td>
<td>&nbsp;</td>
</tr>
</tbody>
</table>

It will not work on div-based tables, though. This one will not work:

<div class="divTable">
<div class="divTableBody">
<div class="divTableRow">
<div class="divTableCell">&nbsp;</div>
<div class="divTableCell">&nbsp;</div>
</div>
<div class="divTableRow">
<div class="divTableCell">&nbsp;</div>
<div class="divTableCell">&nbsp;</div>
</div>
</div>
</div>
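
The difference can be checked offline with a small sketch. The helper name count_tables and the inline markup are made up for illustration (the cell values are borrowed from the currency table in this thread), and flavor="lxml" assumes the lxml package is installed:

```python
from io import StringIO

import pandas as pd

# Real <table> markup: read_html can parse this.
TABLE_HTML = """
<table>
  <tr><td>EUR</td><td>0.889216</td></tr>
  <tr><td>GBP</td><td>0.764041</td></tr>
</table>
"""

# Div-based "table": there is no <table> tag for read_html to find.
DIV_HTML = """
<div class="divTable">
  <div class="divTableRow">
    <div class="divTableCell">EUR</div>
    <div class="divTableCell">0.889216</div>
  </div>
</div>
"""


def count_tables(html: str) -> int:
    """Return how many tables read_html finds in the markup (hypothetical helper)."""
    try:
        # Assumption: lxml is installed; StringIO avoids passing raw HTML as a string.
        return len(pd.read_html(StringIO(html), flavor="lxml"))
    except ValueError:
        # read_html raises ValueError when it finds no tables.
        return 0


print(count_tables(TABLE_HTML))
print(count_tables(DIV_HTML))
```

For div-based layouts you would have to walk the div elements yourself (for example with BeautifulSoup) and build the DataFrame from the extracted cells.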

Upvotes: 0

jezrael

Reputation: 863501

You can pass the URL directly to read_html, which returns a list of DataFrames, and select the first one by indexing with [0]:

import pandas as pd

URL = "https://www.xe.com/currencytables/?from=USD&date=2019-05-01"

FXrate = pd.read_html(URL)[0]
print(FXrate.head())
  Currency code  ▲▼  Currency name  ▲▼  Units per USD  USD per Unit
0               USD          US Dollar       1.000000      1.000000
1               EUR               Euro       0.889216      1.124586
2               GBP      British Pound       0.764041      1.308830
3               INR       Indian Rupee      69.564191      0.014375
4               AUD  Australian Dollar       1.420778      0.703840

If you need the second table:

FXrate = pd.read_html(URL)[1]
print(FXrate.head())
    Currency       Rate Unnamed: 2
0  EUR / USD    1.11483          ▼
1  GBP / EUR    1.13897          ▼
2  USD / JPY  110.13300          ▼
3  GBP / USD    1.26976          ▼
4  USD / CHF    1.01103          ▼
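
To see why wrapping the result in pd.DataFrame() is unnecessary, here is a minimal offline sketch. The two hand-built frames are only a stand-in for the list that read_html would return (a few values copied from the output above, with simplified column names), so nothing is fetched from the site:

```python
import pandas as pd

# Stand-in for pd.read_html(URL): a list with one DataFrame per <table>.
# Values are copied from the sample output; column names are simplified.
rates = pd.DataFrame(
    {"Currency code": ["USD", "EUR"], "Units per USD": [1.000000, 0.889216]}
)
crosses = pd.DataFrame({"Currency": ["EUR / USD"], "Rate": [1.11483]})
tables = [rates, crosses]

# The container is a plain Python list, not a DataFrame.
print(type(tables))

# Indexing pulls out each table, already as a DataFrame.
FXrate = tables[0]
second = tables[1]
print(FXrate)
```

Each element of the list is already a DataFrame, so indexing is all that is needed.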

Upvotes: 1
