Shashi

Reputation: 2714

Writing small files in HDFS

I know it sounds silly, and I understand Hadoop is not meant for small files, but unfortunately I have received 6,000+ small files, each around 50 KB.

Every time I try to run "hadoop fs -put -f /path/FOLDER_WITH_FILES /target/HDSF_FOLDER", it fails on one random file while making a connection to the NameNode:

java.net.SocketTimeoutException: 75000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel

I was wondering if there is a better approach to writing small files to HDFS.

Thanks

Upvotes: 0

Views: 266

Answers (1)

Rohit Nimmala

Reputation: 1539

It is always advisable to merge all your small files into a Hadoop SequenceFile and process that instead. It will give you a performance gain, since HDFS stores metadata for every file in the NameNode's memory and MapReduce launches roughly one task per file, so thousands of tiny files are expensive both to store and to process.
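Building a SequenceFile requires the Hadoop Java API (`SequenceFile.createWriter`), but the same idea — combine the small files into one large file locally, then upload it with a single `hadoop fs -put` — can be sketched without a cluster. Below is a minimal, hedged illustration in Python that bundles a directory of small files into one uncompressed tar archive; the directory and archive paths are hypothetical, and this stands in for the SequenceFile step, which you would use instead when the files must remain individually addressable by key in MapReduce.

```python
import os
import tarfile

def bundle_small_files(src_dir, archive_path):
    """Pack every regular file in src_dir into one uncompressed tar archive.

    Uploading one large archive with a single `hadoop fs -put` avoids
    opening thousands of short-lived connections for tiny files, which
    is what tends to trigger the SocketTimeoutException in the question.
    """
    with tarfile.open(archive_path, "w") as tar:
        for name in sorted(os.listdir(src_dir)):
            path = os.path.join(src_dir, name)
            if os.path.isfile(path):
                # arcname keeps only the file name, not the local path
                tar.add(path, arcname=name)
    return archive_path
```

After bundling, a single `hadoop fs -put merged.tar /target/...` moves everything in one transfer. The design trade-off: a tar (or HAR archive) is opaque to MapReduce, whereas a SequenceFile stores each small file as a key/value record that jobs can split and read directly.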

Upvotes: 0
