Vijay_Shinde

Reputation: 1352

How to delete some data from an HDFS file in Hadoop

I uploaded 50 GB of data to a Hadoop cluster, but now I want to delete the first row of the data file. Removing that row manually and then uploading the file to HDFS again would be time-consuming. Is there a faster way?

Upvotes: 2

Views: 3599

Answers (1)

Remus Rusanu

Reputation: 294247

HDFS files are immutable (for all practical purposes).

You need to upload the modified file(s). You can make the change programmatically with an M/R job that does a near-identity transformation, e.g. by running a streaming shell script that calls sed. The gist is that you need to create new files; HDFS files cannot be edited in place.
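As a rough illustration of the "near-identity transformation" idea, here is a minimal Python sketch of a filter that copies its input to its output while skipping the first row. It is not an M/R job: it assumes the file is streamed through a single process, so "first row" means the first line of the whole file. The function name `drop_first_rows` and the `count` parameter are made up for this example.

```python
import sys
from typing import Iterable, Iterator


def drop_first_rows(lines: Iterable[bytes], count: int = 1) -> Iterator[bytes]:
    """Yield every line except the first `count` lines (hypothetical helper)."""
    for index, line in enumerate(lines):
        if index >= count:
            yield line


def main() -> None:
    # Work on raw bytes so the data is passed through unchanged,
    # whatever its text encoding is.
    out = sys.stdout.buffer
    for line in drop_first_rows(sys.stdin.buffer):
        out.write(line)


if __name__ == "__main__":
    main()
```

Assuming the script is saved as `drop_first_row.py` and the paths below are placeholders, it could be used without a local copy of the data by piping through the HDFS shell: `hdfs dfs -cat /data/input.csv | python3 drop_first_row.py | hdfs dfs -put - /data/input_trimmed.csv`. This still writes a new file, as the answer says it must. If the same filter were run as a streaming mapper instead, it would drop the first line of every input split, not just the first line of the file.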

Upvotes: 3
