Pedro Neves

Reputation: 374

Is it possible to import a table with sqoop and add an extra timestamp column?

Is it possible to use the sqoop "import table" command to import a table from an Oracle database into a Hadoop cluster and add an extra column with the current timestamp (for troubleshooting purposes)? So far, I have the following command:

sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true --connect jdbc:oracle:thin:@//MY_ORACLE_SERVER --username USERNAME --password PASSWORD --target-dir /MyDIR --fields-terminated-by '\b' --table SOURCE_TABLE --hive-table DESTINATION_TABLE --hive-import --hive-overwrite --hive-delims-replacement '<newline>'

I would like to add a timestamp column to the table so that I know when that data was loaded. Is it possible?

Thanks in advance

Upvotes: 0

Views: 1110

Answers (2)

Manish Pansari

Reputation: 401

You can create a temporary Hive table with Sqoop, then create a new Hive table from the old one with the extra columns you need.
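A minimal sketch of that approach (the table and column names here are placeholders, not from the question): import into a staging table with Sqoop as usual, then build the final table with CREATE TABLE ... AS SELECT in Hive, adding a load-timestamp column:

```sql
-- Assumes the sqoop import targeted a staging table, e.g.
--   --hive-table staging_table --hive-import
-- Then create the final table with an extra column recording the load time:
CREATE TABLE destination_table AS
SELECT s.*,
       current_timestamp() AS load_ts  -- when this row was loaded
FROM staging_table s;
```

The staging table can then be dropped or truncated before the next load.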

Upvotes: 1

Christian Arias

Reputation: 61

You can use a free-form query import instead of a table import and call the timestamp function:

sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true --connect jdbc:oracle:thin:@//MY_ORACLE_SERVER --username USERNAME --password PASSWORD --target-dir /MyDIR --fields-terminated-by '\b' --query 'SELECT a.*, systimestamp FROM SOURCE_TABLE a WHERE $CONDITIONS' --hive-table DESTINATION_TABLE --hive-import --hive-overwrite --hive-delims-replacement '<newline>'

Note that with --query, Sqoop requires the literal token $CONDITIONS in the WHERE clause, and since there is no --table to infer a split column from, you also need either --split-by <column> or -m 1. Maybe you could use sysdate instead of systimestamp (smaller datatype but less precision).

Upvotes: 1
