johnred

Reputation: 97

How to edit DataFrames and convert them to a list (pandas, read_html())?

I used the pandas function read_html() to import a table from a webpage. I want to insert the values from that table into an MS SQL table, but to do that I first have to edit the table and convert it to a list. This is difficult because read_html() returns a list of DataFrames.

My Python code:

import requests
import pandas as pd
r = requests.get('URL')
pd.set_option('display.max_rows', 10000)
df = pd.read_html(r.content)
print(df)

Result of print(df), a list of DataFrames:

[             0                     1              2   3
0        Number                  Name           Plan NaN
1          NaN                   NaN   not(selected) NaN
2     53494580          + (53)494580         551 NaN
3     53494581          + (53)494581         551 NaN
4     53494582          + (53)494582         551 NaN
5     55110000          + (53)494583         551 NaN]

I would like the following rows to be written to the MS SQL table:

[['1','NaN','NaN','not(selected)','NaN'],
['2','53494580','+ (53)494580','NP_551','NaN'],
['3','53494581','+ (53)494581','NP_551','NaN'],
['4','53494582','+ (53)494582','NP_551','NaN'],
['5','55110000','+ (53)494583','NP_551','NaN']]

How can I edit the DataFrames and convert them to a list? I would be grateful for any help.

Upvotes: 0

Views: 81

Answers (2)

jezrael

Reputation: 862601

I think you need the header parameter to use the first row as the column names, and then [0] to select the first element of the list, which returns a DataFrame:

df = pd.read_html(r.content, header=0)[0]

For nested lists, use values with tolist:

arr = df.values.tolist()
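
To also get the edits shown in the question (a row number in front of each row and an NP_ prefix on the numeric plans), a minimal sketch might look like this. The frame is built by hand as a stand-in for the result of pd.read_html(r.content, header=0)[0], with values copied from the question's output. The column name "Unnamed: 3" and the helper name to_rows are assumptions made for illustration, and every cell is stored as a string here, whereas a real read_html result may hold the Number column as floats.

```python
import pandas as pd

# Hand-built stand-in for pd.read_html(r.content, header=0)[0];
# values copied from the question, "Unnamed: 3" is an assumed column name.
df = pd.DataFrame({
    "Number": [None, "53494580", "53494581", "53494582", "55110000"],
    "Name": [None, "+ (53)494580", "+ (53)494581", "+ (53)494582", "+ (53)494583"],
    "Plan": ["not(selected)", "551", "551", "551", "551"],
    "Unnamed: 3": [None] * 5,
})


def to_rows(frame):
    """Hypothetical helper: rows as lists of strings, row number first."""
    out = frame.copy()
    plan = out["Plan"].astype(str)
    # Prefix only purely numeric plan codes; 'not(selected)' stays as it is.
    out["Plan"] = plan.where(~plan.str.isdigit(), "NP_" + plan)
    rows = []
    for i, values in enumerate(out.values.tolist(), start=1):
        # Missing cells become the literal string 'NaN'.
        cells = ["NaN" if pd.isna(v) else str(v) for v in values]
        rows.append([str(i)] + cells)
    return rows


rows = to_rows(df)
```

Each element of rows is then a flat list of strings, which is the shape most database drivers expect for a bulk insert.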

Upvotes: 2

zipa

Reputation: 27869

As mentioned in the other answer, you should select the DataFrame using:

df = pd.read_html(r.content, header=0)[0]

Then, to turn it into a matrix, use as_matrix() (deprecated in pandas 0.23 and removed in 1.0, where to_numpy() replaces it):

df.as_matrix()

This gives you a NumPy ndarray, which can be turned into a nested list via:

df.as_matrix().tolist()
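
A minimal sketch of the same two steps using to_numpy(), which is the spelling that works on current pandas releases. The small frame is made up for illustration and is not the table from the question.

```python
import pandas as pd

# Made-up frame for illustration only.
df = pd.DataFrame({"Number": [53494580, 53494581], "Plan": ["551", "551"]})

matrix = df.to_numpy()    # 2-D NumPy ndarray
nested = matrix.tolist()  # nested Python list, one inner list per row
```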

Upvotes: 1
