Billie AK

Reputation: 65

What's the best approach to load Teradata table data into a Hive table using NiFi?

I'm new to NiFi, so could you help me understand the platform and its capabilities? Would I be able to use a NiFi process to create a new table in Hive and move data into it weekly from a Teradata database in the way I've defined below? How would I go about it? I'm not sure if I'm building a sensible flow.

Would the following process suffice: QueryDatabaseTable (configured with a connection pooling service for Teradata, a new table name, and a schedule for ingestion) --> PutHiveStreaming (to create the table defined earlier)? And then how do I pull the Teradata schema into the new table?

Upvotes: 1

Views: 822

Answers (1)

notNull

Reputation: 31490

If you want to create a new Hive table as part of the ingestion process, then:

Method 1:

The ConvertAvroToORC processor adds a hive.ddl attribute (an external-table CREATE statement) to the flowfile; you can execute that attribute with the PutHiveQL processor to create the table in Hive.

If you want to create a transactional table instead, you need to modify the hive.ddl attribute; a sketch of one way to do that follows.
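For illustration, a minimal Jython sketch for NiFi's ExecuteScript processor (placed between ConvertAvroToORC and PutHiveQL) could rewrite the attribute like this; the bucketing column, bucket count, and the exact DDL text ConvertAvroToORC emits are assumptions and may differ in your environment:

    # Jython script body for ExecuteScript. Turns the external-table DDL
    # in the hive.ddl attribute into a transactional (ACID) table DDL.
    flowFile = session.get()
    if flowFile is not None:
        ddl = flowFile.getAttribute('hive.ddl')
        if ddl:
            # Drop the EXTERNAL keyword so Hive creates a managed table.
            ddl = ddl.replace('CREATE EXTERNAL TABLE', 'CREATE TABLE')
            # Add the bucketing and table properties ACID tables require,
            # keeping the ORC storage clause. 'id' is a placeholder column.
            ddl = ddl.replace(
                'STORED AS ORC',
                "CLUSTERED BY (id) INTO 4 BUCKETS STORED AS ORC "
                "TBLPROPERTIES ('transactional'='true')")
            flowFile = session.putAttribute(flowFile, 'hive.ddl', ddl)
        session.transfer(flowFile, REL_SUCCESS)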

Refer to this link for more details. If you want to pull only the delta records from the source, you can use

ListDatabaseTables (lists all tables from the source DB) + GenerateTableFetch (stores the state) processors.

Flow: [screenshot of the example flow]

Method 2:

The QueryDatabaseTable processor outputs flowfiles in Avro format. You can then use the ExtractAvroMetadata processor to extract the Avro schema and, with a small script, build a new attribute containing the DDL you need (i.e. for a managed/external/transactional table), as sketched below.
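As a rough sketch of such a script (again Jython in ExecuteScript; it assumes ExtractAvroMetadata was configured via its Metadata Keys property to expose the full schema in an avro.schema attribute, and the type mapping is deliberately minimal):

    # Jython script body for ExecuteScript. Builds a hive.ddl attribute
    # from the Avro schema extracted by ExtractAvroMetadata.
    import json

    # Simplistic Avro-to-Hive type mapping; extend as needed.
    TYPE_MAP = {'string': 'STRING', 'int': 'INT', 'long': 'BIGINT',
                'float': 'FLOAT', 'double': 'DOUBLE',
                'boolean': 'BOOLEAN', 'bytes': 'BINARY'}

    def hive_type(avro_type):
        # Nullable fields arrive as unions like ["null", "long"].
        if isinstance(avro_type, list):
            avro_type = [t for t in avro_type if t != 'null'][0]
        return TYPE_MAP.get(avro_type, 'STRING')

    flowFile = session.get()
    if flowFile is not None:
        schema = json.loads(flowFile.getAttribute('avro.schema'))
        cols = ', '.join('`%s` %s' % (f['name'], hive_type(f['type']))
                         for f in schema['fields'])
        # Plain managed table here; adjust the statement for an
        # external or transactional table as needed.
        ddl = ('CREATE TABLE IF NOT EXISTS %s (%s) STORED AS ORC'
               % (schema['name'], cols))
        flowFile = session.putAttribute(flowFile, 'hive.ddl', ddl)
        session.transfer(flowFile, REL_SUCCESS)

The flowfile can then be routed to PutHiveQL to run the generated statement before the data itself is loaded.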

Upvotes: 2
