Reputation: 577
I am trying to ingest CSV data into a Hive database. For this purpose,
I tried the following flow:
ListFile --> FetchFile --> ConvertCSVToAvro --> ConvertAvroToORC --> PutHDFS
The CSV data is converted into ORC format and loaded into HDFS. On top of this HDFS data, I am able to create a Hive external table.
Now I want to test the PutHiveQL processor.
For this, do I need to convert the CSV data to Avro and then to JSON?
Can't ORC data be loaded directly into Hive?
If it can, do we have to create the Hive table manually, or is it created automatically?
Upvotes: 1
Views: 2664
Reputation: 31490
We can create the Hive table in the NiFi flow itself.
The ConvertAvroToORC processor adds a hive.ddl attribute to the flowfiles; using that attribute, we can create the table in Hive with the PutHiveQL processor.
ListFile --> FetchFile --> ConvertCSVToAvro --> ConvertAvroToORC --> PutHDFS -->
ReplaceText (Always Replace with ${hive.ddl}) --> PutHiveQL
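As a rough illustration of what the ReplaceText step hands to PutHiveQL, here is a minimal Python sketch, not NiFi code. It assumes the hive.ddl attribute holds a partial CREATE EXTERNAL TABLE statement without a LOCATION clause, so the HDFS directory written by PutHDFS has to be appended to it. The function name, the table definition, and the /data/people path are all hypothetical; check the real attribute values on your own flowfiles.

```python
def build_create_table_statement(hive_ddl: str, hdfs_directory: str) -> str:
    """Append a LOCATION clause to the partial DDL carried in hive.ddl.

    This only mimics the string that a ReplaceText processor would put
    into the flowfile content; it does not talk to NiFi or Hive.
    """
    # Trim stray whitespace so the clause is separated by a single space.
    ddl = hive_ddl.strip()
    return f"{ddl} LOCATION '{hdfs_directory}'"


# Hypothetical attribute values, for illustration only.
hive_ddl = (
    "CREATE EXTERNAL TABLE IF NOT EXISTS people "
    "(id INT, name STRING) STORED AS ORC"
)
hdfs_directory = "/data/people"

statement = build_create_table_statement(hive_ddl, hdfs_directory)
print(statement)
```

In the flow itself, the same effect would come from a ReplaceText replacement value that combines ${hive.ddl} with a LOCATION clause pointing at the directory PutHDFS wrote to.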
Refer to this, where I have explained in detail the NiFi flow for creating tables/partitions dynamically in Hive.
Store the data in HDFS, then create the table on top of the HDFS directory. Use SelectHiveQL to read data from the table; based on the output format (CSV, Avro) selected in the processor, it produces a flowfile in that format.
Upvotes: 3