Reputation: 463
I would like to merge the files of two directories into a single directory while maintaining the directory structure.
I have directory 1 and directory 2, each with approximately 80 subdirectories, structured as shown below.
Directory 1 on HDFS:
Directory 2 on HDFS:
I want file11 of dir 1 and file26 of dir 2 to end up under a single directory, likewise file13 of dir 1 and file27 of dir 2, and so on. The destination directory is directory 1.
Each file added from dir 2 to dir 1 should go into the subdirectory whose path matches its own.
Desired Output:
Any help is appreciated.
Upvotes: 0
Views: 903
Reputation: 463
I'm generating a unique file name for each file under directory 2 and copying it into the matching subdirectory under dir 1. The script follows:
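# Copy every file under /user/hadoop/2/<subDir>/ into /user/hadoop/1/<subDir>/,
# appending a UUID to the name so nothing in directory 1 is overwritten.
for file in $(hadoop fs -ls /user/hadoop/2/* | grep -o -e "/user/hadoop/2/.*") ; do
    subDir=$(echo "$file" | cut -d '/' -f 5)      # 5th field: subdirectory name
    fileName=$(echo "$file" | cut -d '/' -f 6)    # 6th field: file name
    uuid=$(uuidgen)
    newFileName="${fileName}_${uuid}"
    hadoop fs -cp "$file" "/user/hadoop/1/$subDir/$newFileName"
done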
Upvotes: 1
Reputation: 29155
Use the org.apache.hadoop.fs.FileUtil API.
You can obtain a FileSystem with the call below:
final FileSystem fs = FileSystem.get(conf);
FileUtil.copy
public static boolean copy(FileSystem srcFS, Path[] srcs, FileSystem dstFS, Path dst, boolean deleteSource, boolean overwrite, Configuration conf) throws IOException
This method copies files between FileSystems.
FileUtil.replaceFile(File src, File target)
may also work; see the documentation for this method: "Move the src file to the name specified by target." Note that it takes java.io.File arguments, so it applies to files on the local file system rather than to HDFS paths.
In either case, you need to pair up the matching folders, e.g. /user/hadoop/2/abc/ and /user/hadoop/1/abc/, by comparing the part of each path after the fourth slash (the subdirectory name). If they match, copy the source to the destination, or develop the logic according to your requirement (this I will leave to you :-)).
After copying to the desired target, you can list the copied files with the example method below:
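The path-matching step described above can be sketched in plain Java. This is only an illustration under assumptions: the class MergePaths and the method destinationFor are hypothetical names, not part of Hadoop, and the example paths reuse the /user/hadoop/1 and /user/hadoop/2 roots from this thread. It does no copying itself. It only computes where a file from the source root would land under the destination root.

```java
import java.util.Optional;

// Hypothetical helper (not a Hadoop class): maps a file under the source root
// to the same subdirectory under the destination root.
final class MergePaths {

    // Returns the destination path, or empty if the file is not under srcRoot.
    static Optional<String> destinationFor(String file, String srcRoot, String dstRoot) {
        String srcPrefix = srcRoot.endsWith("/") ? srcRoot : srcRoot + "/";
        if (!file.startsWith(srcPrefix)) {
            return Optional.empty();
        }
        // Everything after the source root, e.g. "abc/file26".
        String relative = file.substring(srcPrefix.length());
        if (relative.isEmpty()) {
            return Optional.empty();
        }
        String dstPrefix = dstRoot.endsWith("/") ? dstRoot : dstRoot + "/";
        return Optional.of(dstPrefix + relative);
    }

    public static void main(String[] args) {
        // Example paths are assumptions based on the layout described in the question.
        System.out.println(
            destinationFor("/user/hadoop/2/abc/file26", "/user/hadoop/2", "/user/hadoop/1")
                .orElse("no match"));
    }
}
```

The resulting string could then be wrapped in a Path and passed as the dst argument of FileUtil.copy. Returning an empty Optional for paths outside the source root keeps unrelated files from being copied by accident.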
/**
 * Logs the status of every entry directly under the given directory.
 *
 * @param destination path of the directory to list
 * @param fs the FileSystem that holds the directory
 * @throws FileNotFoundException if the directory does not exist
 * @throws IOException if the listing fails
 */
public static void listFileStats(final String destination, final FileSystem fs) throws FileNotFoundException, IOException {
    final FileStatus[] statuses = fs.listStatus(new Path(destination));
    for (final FileStatus status : statuses) {
        // The log call below uses SLF4J; any other logger works as well.
        LOG.info("-- status {} ", status.toString());
    }
}
Upvotes: 0