Andreas
Andreas

Reputation: 167

sqlalchemy: Insert html table into mysql db

Im new to python (3) and would like to now the following:

I'm trying to collect data via pandas from a website and would like to store the results into a mysql database like:

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("mysql://python:"+'pw'+"@localhost/test?charset=utf8")

url = r'http://www.boerse-frankfurt.de/devisen'
dfs = pd.read_html(url,header=0,index_col=0,encoding="UTF-8")
devisen = dfs[9] #Select the right table
devisen.to_sql(name='table_fx', con=engine, if_exists='append', index=False)

I'm receiving the following error:

.... _mysql.connection.query(self, query) sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1054, "Unknown column '\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tBezeichnung\n\t\t\t\t\t\t\t\n\t\t\t\t' in 'field list'") [SQL: 'INSERT INTO tbl_fx (\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tBezeichnung\n\t\t\t\t\t\t\t\n\t\t\t\t, \n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tzum Vortag\n\t\t\t\t\t\t\t\n\t\t\t\t, \n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tLetzter Stand\n\t\t\t\t\t\t\t\n\t\t\t\t, \n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tTageshoch\n\t\t\t\t\t\t\t\n\t\t\t\t, \n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tTagestief\n\t\t\t\t\t\t\t\n\t\t\t\t, \n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t52-Wochenhoch\n\t\t\t\t\t\t\t\n\t\t\t\t, \n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t52-Wochentief\n\t\t\t\t\t\t\t\n\t\t\t\t, \n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tDatum\n\t\t\t\t\t\t\t\n\t\t\t\t, \nAktionen\t\t\t\t) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)'] [parameters: (('VAE Dirham', '-0,5421%', 45321.0, 45512.0, 45306.0, 46080.0, 38550.0, '20.02.2018 14:29:00', None), ('Armenischer Dram', '-0,0403%', 5965339.0, 5970149.0, 5961011.0, 6043443.0, 5108265.0, '20.02.2018 01:12:00', None), ....

How can sqlalchemy INSERT respective data into table_fx? Problem is the header with the multiple \n and \t.

The mysql table hase the following structur:

( name varchar(10) COLLATE utf8_unicode_ci DEFAULT NULL, bezeichnung varchar(150) COLLATE utf8_unicode_ci DEFAULT NULL, diff_vortag varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL, last double DEFAULT NULL, day_high double DEFAULT NULL, day_low double DEFAULT NULL, 52_week_high double DEFAULT NULL, 52_week_low double DEFAULT NULL, date_time varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL, unnamed varchar(200) COLLATE utf8_unicode_ci DEFAULT NULL )

Any help is higly welcome.

Thank you very much in advance

Andreas

Upvotes: 1

Views: 515

Answers (1)

tadamhicks
tadamhicks

Reputation: 925

This should do it. If you convert to a dataframe you can rename columns first. The "dfs" entity you were creating was actually a list of dataframe entities.

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("mysql://python:"+'pw'+"@localhost/test?charset=utf8")

url = r'http://www.boerse-frankfurt.de/devisen'
dfs = pd.read_html(url,header=0,index_col=0,encoding="UTF-8")

devisen = dfs[9].dropna(axis=0, thresh=4) # Select right table and make a DF

devisen.columns = devisen.columns.str.strip() # Strip extraneous characters

devisen.to_sql(name='table_fx', con=engine, if_exists='append', index=False)

Upvotes: 1

Related Questions