DangerPaws
DangerPaws

Reputation: 866

CSV data import and data processing

I'm having to import, on a very regular basis, data from a CSV into a MySQL database.

LOAD DATA LOCAL INFILE '/path/to/file.csv' INTO TABLE `tablename` FIELDS TERMINATED BY ','

The data I'm importing doesn't have a primary key column and equally I can't alter the structure of the CSV file as I have no control over it.

So I need to import this CSV data into a temporary MySQL table which is fine but then I need to take this data and process it line by line. As each row is run through a process, I need to delete that row from the temporary table so that I don't re-process it.

Because the temporary table has no primary key I can't do DELETE FROM tablename WHERE id=X which would be the best option, instead I have to match against a bunch of alphanumeric columns (probably up to 5 in order to avoid accidentally deleting a duplicate).

Alternatively I was thinking I could alter the table AFTER the CSV import process was complete and add a primary key column, then process the data as previously explained. Then when complete, alter the table again to remove the primary key column ready for a new import. Can someone please tell if this is a stupid idea or not? What would be most efficient and quick?

Any ideas or suggestions greatly appreciated!

Upvotes: 0

Views: 228

Answers (1)

peterm
peterm

Reputation: 92845

You can have an auto_increment column in your temporary table from the beginning and populate values as you load data

CREATE TEMPORARY TABLE tablename
(
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  col1 INT,
  col2 VARCHAR(32),
  col3 INT,
  ...
);

Then specify all columns in parentheses, but leave id out

LOAD DATA LOCAL INFILE '/path/to/file.csv' 
INTO TABLE `tablename` 
FIELDS TERMINATED BY ','
(col1, col2, col3,...); -- specify all columns, but leave id out

That way you don't need to add and remove id column before and after import. Since you're doing import on a regular basis you can consider to use a permanent table instead of temporary one and just TRUNCATE it after you done with the import to clear the table and reset id column.

Upvotes: 3

Related Questions