Mehdi TAZI

Reputation: 575

Insert data in HDFS using Hive

Let's assume that we have an external Hive table pointing to CSV files in an HDFS directory.

So what happens when a new row is inserted into this table using Hive:

  1. Will the insert cause a rewrite of the whole table?
  2. Or a rewrite of the whole HDFS block where the data is located?
  3. Or will it simply append the new row to the end of the file?

Same question for the update operation.

Thanks in advance!

Upvotes: 0

Views: 676

Answers (1)

venkata

Reputation: 477

To answer your question, I am assuming that you are using a plain INSERT INTO statement and not INSERT OVERWRITE.

  1. No, the insert creates a new file containing the data you inserted.
  2. No, only a new file is added; the existing blocks are not rewritten.
  3. No, nothing is appended to the existing files.

Even if an INSERT INTO writes several files, those new files are simply added to the table's directory in HDFS without affecting the existing files.

If you use INSERT OVERWRITE, all the files present in the table's directory are deleted and new files are placed in that directory.
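
As a rough illustration of the two cases, here is a minimal HiveQL sketch. The table name `events`, its columns, the location `/data/events`, and the source table `events_staging` are all made up for the example, and it assumes a Hive version that supports `INSERT ... VALUES` (0.14 or later).

```sql
-- Hypothetical external table over a directory of CSV files.
CREATE EXTERNAL TABLE IF NOT EXISTS events (
  id    INT,
  label STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/events';

-- INSERT INTO: the new row is written to a new file under /data/events;
-- the files that were already there are left untouched.
INSERT INTO TABLE events VALUES (1, 'first');

-- INSERT OVERWRITE: the existing files in the table's directory are removed
-- and replaced by the files produced by this query.
-- events_staging is a hypothetical table with the same columns.
INSERT OVERWRITE TABLE events
SELECT id, label FROM events_staging;
```

Listing the directory before and after each statement (for example with `dfs -ls /data/events;` from the Hive CLI) should show an extra file after the INSERT INTO and a replaced set of files after the INSERT OVERWRITE.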

Upvotes: 1
