Reputation: 840
Merging multiple files into a single file isn't an issue in Unix. However, I want to combine multiple files into fewer files and cap the size of each resulting file.
Here's the full explanation:
1) There are 200 files of varying sizes, ranging from 1 KB to 2 GB.
2) I want to combine these files at random and create multiple files of 5 GB each.
3) So if there are 200 files ranging from 1 KB to 2 GB per file, the resulting set might be 10 files of 5 GB each.
Below is the approach I'm trying, but I couldn't devise the logic and need some assistance:
for i in `ls /tempDir/`
do
    if [[ -r $i ]]
    then
        for files in `find /tempDir/ -size +2G`
        do
            # appends everything into one combined file
            cat $files >> combinedFile.csv
        done
    fi
done
This will only create one file, combinedFile.csv, whatever the size may be. But I need to limit the size of combinedFile.csv to 5 GB and create multiple files: combinedFile_1.csv, combinedFile_2.csv, etc.
I would also like to make sure that when these merged files are created, no row is broken across multiple files.
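In case it helps clarify what I'm after, here is a rough, untested sketch of the direction I'm thinking in (it assumes bash with GNU find and stat; the 5 GB limit, the /tempDir path and the output names are just placeholders):

limit=$((5 * 1024 * 1024 * 1024))   # 5 GB in bytes
outIndex=1
outSize=0

# walk every non-empty regular file and append whole files until the limit is reached
find /tempDir/ -type f -size +0c -print0 | while IFS= read -r -d '' f
do
    size=$(stat -c%s "$f")
    # start a new combined file if this one would push us past 5 GB
    if (( outSize > 0 && outSize + size > limit ))
    then
        outIndex=$((outIndex + 1))
        outSize=0
    fi
    cat "$f" >> "combinedFile_${outIndex}.csv"
    outSize=$((outSize + size))
done

Because only whole files are appended, no row should end up split across outputs (as long as each input file ends with a newline), though a single file larger than 5 GB would still land alone in an oversized output. I'm not sure this is the cleanest or fastest way.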
Any ideas on how to achieve this?
Upvotes: 1
Views: 843
Reputation: 840
I managed a workaround by cat-ing and then splitting the files with the code below:
# append every non-empty regular file under ${dir} into one working file
for files in `find ${dir}/ -size +0c -type f`
do
    if [[ -r $files ]]
    then
        cat $files >> ${workingDirTemp}/${fileName}
    else
        echo "Corrupt Files"
        exit 1
    fi
done
cd ${workingDir}
split --line-bytes=${finalFileSize} ${fileName} --numeric-suffixes -e --additional-suffix=.csv ${unserInputFileName}_
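The --line-bytes option is what keeps the rows intact: split fills each output file up to the given size but only cuts at line boundaries. As a concrete example (the 5G size and the file names are just illustrative):

split --line-bytes=5G --numeric-suffixes -e --additional-suffix=.csv combined.csv combinedFile_

This produces combinedFile_00.csv, combinedFile_01.csv, ... of at most 5 GB each, with no row broken across two files.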
cat is a CPU-intensive operation for big files like 10+ Gigs. Does anyone have any solution that could reduce the CPU load or increase the processing speed?
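One thing I'm considering (untested, assumes GNU split, and reuses the variables from the script above): pipe the concatenation straight into split, so the data is written to disk only once instead of being written into a temporary combined file and then read and written again by split:

find ${dir}/ -size +0c -type f -print0 | xargs -0 cat | split --line-bytes=${finalFileSize} --numeric-suffixes -e --additional-suffix=.csv - ${unserInputFileName}_

Would that be a reasonable way to cut down the extra I/O, or is there a better option?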
Upvotes: 1