Reputation: 6475
Is there any way to change the contents of a file that was added to the DistributedCache once the job is done, and then use the changed file as the DistributedCache for another map/reduce job that runs afterwards?
Upvotes: 0
Views: 91
Reputation: 33495
Check the TrackerDistributedCacheManager.java code for more details. Hadoop keeps a reference count of how many tasks are using each file in the DistributedCache. When the count drops to 0, the file is marked for deletion. So at the end of the job the files in the DistributedCache are cleaned up; otherwise they would keep piling up on the node across jobs.
So you can't change the files in the distributed cache and reuse them in the next job.
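To make the reference counting described above concrete, here is a minimal toy model in plain Java. It is not Hadoop's actual code: the class `CacheRefCountSketch` and its methods are hypothetical names made up for illustration, and the real logic in TrackerDistributedCacheManager.java is more involved. It only shows the idea that a file becomes eligible for cleanup once the last task using it has released it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: NOT Hadoop's TrackerDistributedCacheManager.
// All names here are hypothetical and exist purely for illustration.
class CacheRefCountSketch {
    // Number of running tasks currently using each cached file.
    private final Map<String, Integer> refCounts = new HashMap<>();
    // Files whose count has dropped to zero and may be cleaned up.
    private final Set<String> markedForDeletion = new HashSet<>();

    // A task starts using a cached file.
    void acquire(String file) {
        refCounts.merge(file, 1, Integer::sum);
        markedForDeletion.remove(file);
    }

    // A task finishes; when the count reaches zero the file is marked for deletion.
    void release(String file) {
        Integer count = refCounts.get(file);
        if (count == null) {
            throw new IllegalStateException("not in use: " + file);
        }
        if (count == 1) {
            refCounts.remove(file);
            markedForDeletion.add(file);
        } else {
            refCounts.put(file, count - 1);
        }
    }

    boolean isMarkedForDeletion(String file) {
        return markedForDeletion.contains(file);
    }

    int refCount(String file) {
        return refCounts.getOrDefault(file, 0);
    }
}
```

In this sketch, if two tasks acquire the same file and both release it, the file ends up marked for deletion only after the second release, which mirrors why the cached copies do not survive past the end of the job.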
Upvotes: 2