Michal

Reputation: 1905

Opening a file stored in HDFS to edit in VI

I would like to edit a text file directly in HDFS using VI without having to copy it to local, edit it and then copy it back from local. Is this possible?

Edit: This used to be possible in Cloudera's Hue UI but is no longer the case.

Upvotes: 10

Views: 24156

Answers (5)

Uri Goren

Reputation: 13700

A simple way is to copy the file out of HDFS, edit it locally, and copy it back (see here):

hvim <filename>

Source code of hvim:

hadoop fs -text $1 > hvim.txt          # stream the HDFS file to a local temp file
vim hvim.txt                           # edit it locally
hadoop fs -rm -skipTrash $1            # delete the original (bypassing the trash)
hadoop fs -copyFromLocal hvim.txt $1   # upload the edited copy under the same path
rm hvim.txt                            # clean up the temp file

Upvotes: 3

Ashrith

Reputation: 6855

There are a couple of options you could try that allow you to mount HDFS on your local machine, after which you can use your usual local commands like cp, rm, cat, mv, mkdir, rmdir, more, etc. However, none of them supports random write operations; they only support appends.

The NFS Gateway uses NFS v3 and supports appending to files, but it cannot perform random write operations.
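To make the mount approach concrete, here is a minimal sketch of mounting HDFS through the NFS Gateway, assuming the gateway service is already running; the hostname, paths, and mount point are illustrative:

```shell
# Mount the HDFS root via the NFS Gateway (NFS v3, as the gateway requires).
sudo mkdir -p /mnt/hdfs
sudo mount -t nfs -o vers=3,proto=tcp,nolock namenode-host:/ /mnt/hdfs

# Local tools now work against HDFS paths...
cat /mnt/hdfs/user/alice/notes.txt
echo "extra line" >> /mnt/hdfs/user/alice/notes.txt   # append: supported

# ...but in-place edits (what vi does on save) still fail, because HDFS
# only allows appends, not random writes.
```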

And regarding your comment on Hue: most likely Hue downloads the file to a local buffer and, after editing, replaces the original file in HDFS.

Upvotes: 8

Tagar

Reputation: 14939

The other answers here are correct: you can't edit files in HDFS because it is not a POSIX-compliant filesystem; only appends are possible.

That said, I recently had to fix a header in an HDFS file, and this is the best I came up with:

sc.textFile(orig_file).map(fix_header).coalesce(1).saveAsTextFile(orig_file +'_fixed')

This is Spark (PySpark) code. Note the coalesce(1): the job is no longer parallel, but the benefit is that you get a single output file. Then just move/rename the file from the orig_file + '_fixed' directory to overwrite the original file.

P.S. You could omit the .coalesce(1) part, and the conversion would run in parallel (assuming a big file / multiple splits) and be much faster, but then you would have to merge the output HDFS files into one.

P.P.S. The map call in the pipeline fixes the headers through the fix_header function (not shown here for clarity).
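The answer's actual fix_header is not shown, so as a purely hypothetical illustration, here is one shape such a function could take: map() calls it once per line, so it must fix the header line and pass every other line through unchanged. The column names and the misspelling are invented for this sketch:

```python
def fix_header(line):
    """Fix a misspelled column name in the header row; leave data rows alone."""
    if line.startswith("id,nmae,age"):           # hypothetical bad header
        return line.replace("nmae", "name", 1)   # repair only the header row
    return line                                  # data rows pass through unchanged

# Locally, a plain list comprehension shows the same per-line behavior
# that rdd.map(fix_header) applies across the cluster.
lines = ["id,nmae,age", "1,Alice,30", "2,Bob,25"]
fixed = [fix_header(l) for l in lines]
print(fixed[0])  # id,name,age
```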

Upvotes: 0

deeksha

Reputation: 11

A file in HDFS can be replaced using the -f option of hadoop fs -put. This eliminates the need to delete the file and then copy it.
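As a small sketch of this (the paths are illustrative):

```shell
# Overwrite an existing HDFS file in one step; -f forces the overwrite,
# so no separate "hadoop fs -rm" is needed first.
hadoop fs -put -f notes.txt /user/alice/notes.txt
```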

Upvotes: 1

Suman

Reputation: 482

A file in HDFS can't be edited directly, and you can't modify it in place. The only way is to delete the file and replace it with a new one.

Edit the file locally and copy it to HDFS again. Don't forget to delete the old file if you want to keep the same name.
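The round trip described above looks like this (paths are illustrative):

```shell
hadoop fs -get /user/alice/notes.txt notes.txt   # copy the file to local disk
vi notes.txt                                     # edit it locally
hadoop fs -rm /user/alice/notes.txt              # delete the old HDFS copy
hadoop fs -put notes.txt /user/alice/notes.txt   # upload the edited file under the same name
```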

Upvotes: 0
