Reputation: 759
By default, it appears that "s3 sync" doesn't create empty folders in the destination directory
aws s3 sync s3://source-bucket s3://dest-bucket --include "*" --recursive
I've searched for a few hours now and can't find anything that addresses empty folders/directories when using "sync" or "cp".
FWIW, I do see the following message, which may pertain to the empty folders, but it's hard to know for sure since the source bucket is pretty big and unwieldy.
Completed 4132 of 4132 part(s) with -5 file(s) remaining
Upvotes: 3
Views: 16334
Reputation: 14915
S3 has no concept of directories. S3 is an object store where each object is identified by a key. The key might be a string like "logs/2014/06/04/system.log"
Most tools built on top of S3 (AWS CLI, AWS Console, Cloudberry, Transmit, etc.) interpret the "/" character as a directory separator and present the object list as if it were in a directory structure.
Internally, however, there is no concept of a directory; S3 has a flat namespace. See https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html for more details.
Knowing this, I am not surprised that empty directories are not synced, as there are no directories on S3.
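To illustrate the flat namespace, here is a minimal shell sketch that derives the top-level "folders" a client would display from nothing but a list of keys. The key list is made up for the example (only the first key comes from the text above), and no S3 call is involved.

```shell
# Hypothetical object keys; S3 stores only these strings, no directories.
keys='logs/2014/06/04/system.log
logs/2014/06/05/system.log
images/cat.png
readme.txt'

# A top-level "folder" is just the part of a key before the first "/".
# Keys without a "/" (readme.txt) contribute no folder at all.
prefixes=$(printf '%s\n' "$keys" | awk -F/ 'NF > 1 { print $1 "/" }' | sort -u)
printf '%s\n' "$prefixes"
```

The "folders" here exist only because some key happens to start with them, which is why a prefix with no keys under it has nothing to sync.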
Upvotes: 6
Reputation: 26
The aws s3 sync command does not create empty directories in the destination. This is how I created the empty directories in the destination S3 bucket to match the source S3 bucket (you can run this in CloudShell or any Bash shell):
# Set variables (please modify the values as appropriate)
source_bucket_name="SOURCE_S3_BUCKET_NAME"
destination_bucket_name="DESTINATION_S3_BUCKET_NAME"
input_file="empty-directories.txt"
# Export the list of folder marker keys (keys ending in "/") to the input file
aws s3api list-objects-v2 --bucket "$source_bucket_name" --query 'Contents[].{Key: Key}' | jq -r '.[] | select(.Key | endswith("/")) | .Key' > "$input_file"
# Create empty directories in the destination S3 bucket
while IFS= read -r folder_name || [ -n "$folder_name" ]; do
    # Ensure the folder name ends with a trailing slash
    folder_key="${folder_name%/}/"
    # Create the directory in the destination bucket
    aws s3api put-object --bucket "$destination_bucket_name" --key "$folder_key"
    echo "Created folder: $folder_key in bucket: $destination_bucket_name"
done < "$input_file"
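Note that a key ending in "/" is only a folder marker; it does not by itself prove the folder is empty. As a rough sketch, the helper below (empty_markers is a hypothetical name, not part of the AWS CLI) reads a key list on stdin and keeps only the markers that have nothing beneath them. It assumes one key per line and no newlines inside keys.

```shell
# Print marker keys (ending in "/") that no other key uses as a prefix.
# After a byte-order sort, any key under a marker follows it directly,
# so each marker only needs to be compared with the next line.
empty_markers() {
    LC_ALL=C sort | awk '
        prev != "" && index($0, prev) != 1 { print prev }
        { prev = ($0 ~ /\/$/) ? $0 : "" }
        END { if (prev != "") print prev }
    '
}

# Example with made-up keys: "a/" holds a file, "c/d/" holds a sub-marker.
printf '%s\n' 'a/' 'a/file' 'b/' 'c/d/' 'c/d/e/' 'x.txt' | empty_markers
```

Piping the marker list through a filter like this before the put-object loop would limit the work to folders that sync actually skipped.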
Upvotes: 0
Reputation: 11
This is an old question, but it is still relevant, so I will answer it.
I ran into this problem and solved it by creating an empty file named .folderkeep inside the folder. When you sync, the folder is created along with it.
The file exists only to keep the folder in place and has nothing to do with the other data. Because its name starts with ".", it is hidden in normal directory listings on Unix-like systems.
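A minimal sketch of that idea for a local source tree, assuming a find that supports -empty (GNU and BSD find do; it is not in POSIX). The function name add_folderkeep and the demo paths are made up for illustration.

```shell
# Put a zero-byte .folderkeep marker in every empty directory under a root,
# so each of those directories contains one object for sync to upload.
add_folderkeep() {
    find "$1" -type d -empty -exec sh -c '
        for d in "$@"; do
            : > "$d/.folderkeep"
        done
    ' sh {} +
}

# Demo on a throwaway tree: one empty directory, one with a file in it.
demo=$(mktemp -d)
mkdir -p "$demo/data/empty" "$demo/data/full"
echo hello > "$demo/data/full/file.txt"
add_folderkeep "$demo"
find "$demo" -name .folderkeep
```

Only data/empty receives a marker here; directories that already hold files are left untouched, so the markers stay limited to the folders sync would otherwise drop.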
Upvotes: 1
Reputation: 14508
There is no official way yet.
You can use S3cmd instead of the official AWS client; I have read that it supports syncing empty directories.
Alternatively, you could use bash to add a placeholder file to each empty directory before syncing:
find . -type d -empty -exec touch {}/empty.txt \;
Off-topic but related: I don't want symlinked files to be duplicated (the default when the linked file is local) but I do want to keep the structure (--no-follow-symlinks just ignores the link). So a one-liner to copy the link into a text file:
find . -type l -exec bash -c 'readlink "$1" > "$1.symlink"' _ {} \;
Upvotes: 1
Reputation: 4333
There is no way so far, but a feature request is open to add support for copying empty directories as well:
https://github.com/aws/aws-cli/issues/912
Upvotes: 4