TheMonkWhoSoldHisCode
TheMonkWhoSoldHisCode

Reputation: 2332

Data manipulation in a mysql table

I have 2 columns in the table which have incorrect entries. The size of the table runs in to billions of records. I had like to swap data between two columns (c1 and C2) The approach taken is to export the data in small chunks in to CSV files and then import it back with corrected entries. For example, below is the data set

--------
|C1 | C2 |
|3  | 4  |
|4  | 6  |

I would then export the data in to a semicolon delimted CSV file (complete command NOT shown) as below

SELECT C2,C1 FROM TABLE temp INTO OUTFILE /tmp/test.csv 

The output of such command would be

4;3
6;4

When I import back the data (after deleting the data in question), the data will be corrected as follows

  | C1  C2 |
  | 3   4  |
  | 4   6  |

It is really a matter of OUTFILE and INFILE operation, I believe

Question

  1. Does the approach makes sense? The real data also expects NULL, int values in some of the columns apart from data swaps.
  2. The other complexity is in the production database, I will need to use the WHERE clause. The table name would also be fetched dynamically.
  3. With reference to point 2, how do I add dynamicity to the queries. Should I use a STORED procedure or SHELL SCRIPT? STORED Procedure does not seem to support LOAD DATA INFILE functionality.
  4. If I am left with shell, any sample script that I can reuse? The CSV filename, table name and WHERE clause will have to be built at run time.
  5. Also the size of the chunk to be exported and imported will be calculated dynamically.
  6. Any other approach?

Note - This is a INFOBRIGHT column based table on top of mysql. The UPDATE query is non-performant and ALTER TABLE is not supported by INFOBRIGHT.

Upvotes: -2

Views: 250

Answers (1)

Diego
Diego

Reputation: 669

you can use this approach create a temporary table 'temp_table' and the use this procedure, then call it from anywhere with the name of the table like this.

call change_fields('table_origen');

the procedure works like this

  1. delete the data in the temp_table
  2. insert into this temp_table with the order changed from the table_origen
  3. delete from the table_origen
  4. insert into the table_origen the information from the temp_table

You can go crazy and how many variables you want to accept

CREATE DEFINER=`root`@`%` PROCEDURE `change_fields`(IN `tableName` VARCHAR(200))
    LANGUAGE SQL
    NOT DETERMINISTIC
    CONTAINS SQL
    SQL SECURITY DEFINER
    COMMENT ''
BEGIN

    SET @hr1 = CONCAT('Delete from temp_table;');

    PREPARE hrStmt1 FROM @hr1;
    EXECUTE hrStmt1;
    DEALLOCATE PREPARE hrStmt1;
    SET @hr1 = CONCAT('insert into temp_table (C_1, C_2) Select C2, C1 from  `',tableName,'`;');

    PREPARE hrStmt1 FROM @hr1;
    EXECUTE hrStmt1;
    DEALLOCATE PREPARE hrStmt1;

    SET @hr1 = CONCAT('Delete from `',tableName,'`;');

    PREPARE hrStmt1 FROM @hr1;
    EXECUTE hrStmt1;
    DEALLOCATE PREPARE hrStmt1;

    SET @hr1 = CONCAT('insert into `',tableName,'` (C1,C2) select C_1, C_2 from  temp_table;');

    PREPARE hrStmt1 FROM @hr1;
    EXECUTE hrStmt1;
    DEALLOCATE PREPARE hrStmt1;

END;

Upvotes: 0

Related Questions