Vijay Kansal

Reputation: 839

Copy Multiple files from HDFS to local: Multithreading?

In my Java application, I need to copy multiple files from HDFS to the local file system.

Which of the following two approaches will be faster?
1. Copy the files sequentially, one by one.
2. Run parallel threads, each copying one file.

Upvotes: 0

Views: 1767

Answers (1)

NESPowerGlove

Reputation: 5496

If your local file system sits on a single physical disk, then a sequential approach would be best. With a hard drive, parallel writes force the disk head to seek back and forth unnecessarily (how much depends on the nature of the writes and on how well the OS can schedule them). You also have only one physical resource to work with at a time, so one thread is good enough.

If the local file system spans multiple physical disks, then running parallel threads could give better performance (for example, thread A writes all files going to drive C, while thread B writes all files going to drive D).
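
Since the answer depends on your hardware, it may be worth simply timing both approaches. Below is a minimal sketch of how that could look, not a definitive implementation. The names `CopyTiming`, `Copier`, `LOCAL_COPIER` and `copyAll` are made up for illustration, and the stand-in copier uses plain `java.nio` so the example runs without a cluster. For real HDFS you would presumably swap in a copier that calls Hadoop's `FileSystem.copyToLocalFile(src, dst)`, and the numbers from a local-to-local copy will not carry over to that case.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class CopyTiming {

    /** Hypothetical abstraction over "copy one file into a directory". */
    @FunctionalInterface
    interface Copier {
        void copy(Path source, Path targetDir) throws IOException;
    }

    /** Stand-in copier (local to local). Assumption: replace with an HDFS-backed one. */
    static final Copier LOCAL_COPIER = (source, targetDir) ->
            Files.copy(source, targetDir.resolve(source.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);

    /**
     * Copies every source file into targetDir using a fixed pool of threads.
     * With threads == 1 the files are copied one after another (approach 1);
     * with threads > 1 several copies run concurrently (approach 2).
     * Returns the elapsed wall-clock time in nanoseconds.
     */
    static long copyAll(List<Path> sources, Path targetDir, int threads, Copier copier)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Path source : sources) {
                pending.add(pool.submit(() -> {
                    copier.copy(source, targetDir);
                    return null; // makes this a Callable, so IOException may propagate
                }));
            }
            for (Future<?> future : pending) {
                future.get(); // waits for the copy; rethrows any failure as ExecutionException
            }
        } finally {
            pool.shutdown();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        // Create a few 1 MiB dummy files to copy.
        Path sourceDir = Files.createTempDirectory("copy-src");
        Path targetDir = Files.createTempDirectory("copy-dst");
        List<Path> sources = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            Path file = sourceDir.resolve("part-" + i);
            Files.write(file, new byte[1 << 20]);
            sources.add(file);
        }
        // Time the sequential and the parallel variant on the same input.
        for (int threads : new int[] {1, 4}) {
            long nanos = copyAll(sources, targetDir, threads, LOCAL_COPIER);
            System.out.printf("%d thread(s): %d ms%n", threads, nanos / 1_000_000);
        }
    }
}
```

A single run like this is easily skewed by OS caching and JIT warm-up, so repeat it a few times with realistic file sizes before drawing conclusions.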

Upvotes: 1
