Reputation: 982
I'm new to MySQL but have a solid foundation in Python. After researching this extensively over the last 2 days including reading many stackoverflow questions and answers, I still haven't been able to resolve the issue so any help specific to this problem would be appreciated. UPDATED: The error is posted below.
I am trying to create a database which retrieves daily price data from yahoo and inputs the data into the corresponding table.
The MySQL tables and database were created using the MySQL Workbench 6.1. I'm using Python 2.7 Anaconda distribution on Windows 8.1 64 bit.
Here is the MySQL table:
-- -----------------------------------------------------
-- Table `securities_master_00`.`daily_price`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `securities_master_00`.`daily_price` (
`id` INT NOT NULL AUTO_INCREMENT,
`data_vendor_id` INT NOT NULL,
`symbol_id` INT NOT NULL,
`price_date` DATETIME NOT NULL,
`created_date` DATETIME NOT NULL,
`last_updated_date` DATETIME NOT NULL,
`open_price` DECIMAL(19,4) NULL DEFAULT NULL,
`high_price` DECIMAL(19,4) NULL DEFAULT NULL,
`low_price` DECIMAL(19,4) NULL DEFAULT NULL,
`close_price` DECIMAL(19,4) NULL DEFAULT NULL,
`adj_close_price` DECIMAL(19,4) NULL DEFAULT NULL,
`volume` BIGINT NULL DEFAULT NULL,
INDEX `index_data_vendor_id_idx` (`data_vendor_id` ASC),
PRIMARY KEY (`id`),
INDEX `index_symbol_id_idx` (`symbol_id` ASC),
CONSTRAINT `index_data_vendor_id`
FOREIGN KEY (`data_vendor_id`)
REFERENCES `securities_master_00`.`data_vendor` (`id`)
ON DELETE NO ACTION
ON UPDATE CASCADE,
CONSTRAINT `index_symbol_id`
FOREIGN KEY (`symbol_id`)
REFERENCES `securities_master_00`.`symbol` (`id`)
ON DELETE NO ACTION
ON UPDATE CASCADE)
ENGINE = InnoDB
DEFAULT CHARACTER SET = utf8;
Here is the corresponding python code that generates the error:
def insert_daily_data_into_db(data_vendor_id, symbol_id, daily_data):
'''
Takes a list of tuples of daily data and adds it to the MySQL database.
Appends the vendor ID and symbol ID to the data.
daily_data: List of tuples of the OHLC data (with adj_close and volume)
'''
# Create the time now
now = datetime.datetime.utcnow()
# Amend the data to include the vendor ID and symbol ID
daily_data = [(data_vendor_id, symbol_id, d[0], now, now,
d[1], d[2], d[3], d[4], d[5], d[6]) for d in daily_data]
# Create the insert strings
column_str = '''data_vendor_id, symbol_id, price_date, created_date,
last_updated_date, open_price, high_price, low_price,
close_price, volume, adj_close_price'''
insert_str = ('%s, ' * 11)[:-2]
final_str = 'INSERT INTO daily_price (%s) VALUES (%s)' % \
(column_str, insert_str)
# Using the MySQL connection, carry out an INSERT INTO for every symbol
with con:
cur = con.cursor()
cur.executemany(final_str, daily_data)
if __name__ == '__main__':
# Loop over the tickers and insert the daily historical data into the database
tickers = obtain_list_of_db_tickers()
for t in tickers:
print 'Adding data for %s' % t[1]
yf_data = get_daily_historic_data_yahoo(t[1])
insert_daily_data_into_db('1', t[0], yf_data) # I believe the error is here relating to the data vendor id # but am unclear on the method to solve the problem
ERROR CODE:
Adding data for ABT
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\Owner\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 585, in runfile
execfile(filename, namespace)
File "C:/Users/Owner/Documents/Python Scripts/price_retrieval_mine.py", line 99, in <module>
insert_daily_data_into_db('1', t[0], yf_data)
File "C:/Users/Owner/Documents/Python Scripts/price_retrieval_mine.py", line 91, in insert_daily_data_into_db
cur.executemany(final_str, daily_data)
File "C:\Users\Owner\Anaconda\lib\site-packages\MySQLdb\cursors.py", line 262, in executemany
r = self._query('\n'.join([query[:p], ',\n'.join(q), query[e:]]))
File "C:\Users\Owner\Anaconda\lib\site-packages\MySQLdb\cursors.py", line 354, in _query
rowcount = self._do_query(q)
File "C:\Users\Owner\Anaconda\lib\site-packages\MySQLdb\cursors.py", line 318, in _do_query
db.query(q)
_mysql_exceptions.IntegrityError: (1062, "Duplicate entry '1' for key 'data_vendor_id_UNIQUE'")
UPDATED output using SQL command: SHOW INDEXES FROM securities_master_00.daily_price
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type
daily_price 0 PRIMARY 1 id A 2 NULL NULL BTREE
daily_price 0 data_vendor_id_UNIQUE 1 data_vendor_id A 2 NULL NULL BTREE
daily_price 0 symbol_id_UNIQUE 1 symbol_id A 2 NULL NULL BTREE
daily_price 1 index_data_vendor_id_idx 1 data_vendor_id A 2 NULL NULL BTREE
daily_price 1 index_symbol_id_idx 1 symbol_id A 2 NULL NULL BTREE
Upvotes: 0
Views: 1112
Reputation: 8615
As the SHOW INDEXES
statement result indicates, there are five indexes on your table, though only three are declared in your CREATE TABLE
statement. The two extra indexes are UNIQUE
indexes on your foreign key columns, which is a problem because you need to have a many-to-one relationship between the daily_price
table and either of the data_vendor
and symbol
tables. This reflects the fact that many prices will be generated by the same vendor and, over some period of time, for the same symbols.
You need to DROP
both of these extra indexes - or alternatively, DROP
the daily_price
table and recreate it using the table definition that you posted in this question - in order to stop throwing an IntegrityError
when you try to insert rows into the table.
Put more plainly, the data_vendor_id_UNIQUE
index on the table prevents you from ever having two rows in that table with the same data_vendor_id
. Since every row you insert with data from Yahoo has data_vendor_id = 1
, according to the last line of your Python code (presumably this corresponds with Yahoo's entry in the data_vendor
table), the second row you try to insert violates the unique constraint of that index and produces the error you see here.
It would be a good idea for you to try to figure out where these extra indexes came from, especially if you're working with someone else on this project or using someone else's code. It's possible there are less obvious problems hiding behind this error.
Finally, it will be well worth your time to learn about indexes, how they work, and when to use them, if you plan to do any serious work with MySQL. You should try to become familiar with statements like SHOW INDEXES
and especially EXPLAIN
when it comes to query execution plans, so that you can diagnose errors like this one quickly and easily.
Upvotes: 1