Reputation: 410
My target is a Hive table, loaded with the Informatica ETL tool.
Updates are not supported in earlier versions of Hive, so how should I update records in this scenario? Is it OK to use the Hive update feature that comes with Hive ACID and transactions?
Upvotes: 0
Views: 629
Reputation: 1527
Updates are not the best option when working with Hive; creating intermediate temporary tables is a better design. Steps to update an existing Hive table:
Upvotes: 1
Reputation: 66
Informatica supports updates to Hive tables from version 9.6 HF3 onwards, provided the tables support ACID. For more information, see the Hive Transactions page (https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions). Instead of doing this, though, I would rather do it in a few steps:
1) Identify all the records which exist only in the target and the records which exist only in the stage data.
2) Merge these two sets and load them into a temporary table.
3) Finally, rename the temporary table to the actual target table name.
The above works only for SCD type 1 implementations.
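The steps above can be sketched in plain Python, with lists of dicts standing in for the Hive tables. This is only an illustration of the logic, not Informatica or Hive code. The function name `merge_scd1`, the `id` key column, and the sample rows are all hypothetical. It also assumes that a stage row replaces any target row with the same key (the usual SCD type 1 overwrite), so it keeps the target-only rows and takes every stage row; that reading of step 1 is an assumption.

```python
def merge_scd1(target_rows, stage_rows, key="id"):
    """Build the contents of the temporary table for an SCD type 1 load.

    Assumption: a stage row overwrites the target row with the same key.
    """
    stage_keys = {row[key] for row in stage_rows}
    # Step 1: rows that exist only in the target stay unchanged.
    target_only = [row for row in target_rows if row[key] not in stage_keys]
    # Step 2: merge them with the stage rows (new and changed records).
    return target_only + list(stage_rows)


# Hypothetical sample data.
target = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
stage = [{"id": 2, "name": "robert"}, {"id": 3, "name": "carol"}]

temp_table = merge_scd1(target, stage)

# Step 3: the "rename" step, here just rebinding the name.
target = temp_table
```

In Hive itself, the same idea would typically be an insert into a temporary table followed by a table rename, so no row-level update is ever issued.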
Upvotes: 1
Reputation: 712
You should look into event sourcing (https://msdn.microsoft.com/en-us/library/dn589792.aspx).
Think of your database as storing events instead of items. So if you have some counter object you want to keep in your database, instead of updating the counter from 0 to 1 to 2, etc., you simply insert a new document whenever you increment, and then take the sum/count of those documents.
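A minimal sketch of the counter example, assuming an in-memory list in place of the database. The names `increment`, `current_value`, and the `counter`/`delta` fields are hypothetical and chosen only for illustration.

```python
def increment(events, counter_id, amount=1):
    """Record an increment as a new event; nothing is updated in place."""
    events.append({"counter": counter_id, "delta": amount})


def current_value(events, counter_id):
    """Derive the counter's value by summing all of its events."""
    return sum(e["delta"] for e in events if e["counter"] == counter_id)


# The event log is append-only, like an insert-only table.
events = []
increment(events, "page_views")
increment(events, "page_views")
increment(events, "downloads")

page_views = current_value(events, "page_views")
```

Because the log only ever receives inserts, this pattern fits a store where updates are expensive or unavailable; the cost moves to read time, where the events have to be aggregated.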
Upvotes: 0