Reputation: 759

How to include empty folders in "s3 sync"?

By default, it appears that "s3 sync" doesn't create empty folders in the destination directory

aws s3 sync s3://source-bucket s3://dest-bucket --include "*" --recursive

I've searched for a few hours now and can't find anything that addresses empty folders/directories when using "sync" or "cp".

FWIW, I do see the following message, which may pertain to the empty folders, but it's hard to know for sure since the source bucket is pretty big and unwieldy:

Completed 4132 of 4132 part(s) with -5 file(s) remaining

Upvotes: 3

Answers (5)

Sébastien Stormacq

Reputation: 14915

S3 has no concept of directories. S3 is an object store where each object is identified by a key. The key might be a string like "logs/2014/06/04/system.log"

Most tools built on top of S3 (AWS CLI, AWS Console, Cloudberry, Transmit, etc.) interpret the "/" character as a directory separator and present the file list as if it were in a directory structure.

However, internally, there is no concept of a directory; S3 has a flat namespace. See https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html for more details.

Knowing this, I am not surprised that empty directories are not synced, as there are no directories on S3.
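To make the flat namespace concrete, here is a minimal sketch of how a tool could derive "directories" purely from object keys. The `implied_prefixes` function name and the sample keys are made up for illustration; this is not how any particular client is implemented.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read object keys on stdin and print every "directory"
# prefix implied by the "/" separators, sorted and de-duplicated.
implied_prefixes() {
    awk -F/ '{
        prefix = ""
        # Every field except the last is a path component, not a file name.
        for (i = 1; i < NF; i++) {
            prefix = prefix $i "/"
            print prefix
        }
    }' | LC_ALL=C sort -u
}

# Sample keys (made up). No directory objects exist here, only keys.
printf '%s\n' \
    "logs/2014/06/04/system.log" \
    "logs/2014/06/05/system.log" \
    "readme.txt" | implied_prefixes
```

This prints logs/, logs/2014/, logs/2014/06/, logs/2014/06/04/ and logs/2014/06/05/, one per line. A prefix only shows up because some key contains it, which is why a folder with no keys under it has nothing to sync.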

Upvotes: 6

gimhan

Reputation: 26

The aws s3 sync command does not create empty directories in the destination. This is how I created the empty directories in the destination S3 bucket to match the source S3 bucket (you can run this in CloudShell or a Bash shell):

# Set variables (modify the values as appropriate)

source_bucket_name="SOURCE_S3_BUCKET_NAME"

destination_bucket_name="DESTINATION_S3_BUCKET_NAME"

input_file="empty-directories.txt"

# Write the list of folder keys (keys ending in "/", which is how S3 stores empty folders) to the file named by $input_file

aws s3api list-objects-v2 --bucket "$source_bucket_name" --query 'Contents[].{Key: Key}' | jq -r '.[] | select(.Key | endswith("/")) | .Key' > "$input_file"

# Create empty directories in the destination S3 bucket

while IFS= read -r folder_name || [ -n "$folder_name" ]; do
    # Ensure the folder name ends with a trailing slash
    folder_key="${folder_name%/}/"
    
    # Create the directory in the destination bucket
    aws s3api put-object --bucket "$destination_bucket_name" --key "$folder_key"
    
    echo "Created folder: $folder_key in bucket: $destination_bucket_name"
done < "$input_file"

Upvotes: 0

Gökhan Koçyiğit

Reputation: 11

This is an old question but it is still relevant so I will answer it.

I came across this problem and solved it by creating an empty file named .folderkeep inside the folder. When you sync, the folder is created along with the file.

The file exists only so that the folder is not empty; it has nothing to do with the other data. Because its name starts with ".", it is also hidden in directory listings on Unix-like systems.
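A minimal sketch of automating this for a whole local tree, assuming a find that supports -empty (GNU and BSD do). The `add_keep_files` function name and the demo paths are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: put an empty .folderkeep file into every empty
# directory under the given root, so each directory has an object to sync.
add_keep_files() {
    local root="$1"
    # -empty is supported by GNU and BSD find, but is not in POSIX.
    find "$root" -type d -empty -exec sh -c '
        for dir in "$@"; do
            : > "$dir/.folderkeep"   # create a zero-byte placeholder
        done
    ' _ {} +
}

# Demo on a throwaway tree (paths are made up for illustration).
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/a/empty" "$demo_root/b"
: > "$demo_root/b/file.txt"

add_keep_files "$demo_root"
(cd "$demo_root" && find . -name .folderkeep)

# After this, a normal sync uploads the placeholders too, for example:
#   aws s3 sync "$demo_root" s3://dest-bucket
rm -rf "$demo_root"
```

The demo prints ./a/empty/.folderkeep: only the directory that was actually empty gets a placeholder, while a and b are left alone because they already contain something.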

Upvotes: 1

SamGoody

Reputation: 14508

There is no official way yet.

You can use S3cmd instead of the official AWS client; I have read that it supports syncing empty directories.

Alternatively, you could use bash to add a file to empty directories:

find . -type d -empty -exec touch {}/empty.txt \;

Off-topic but related: I don't want symlinked files to be duplicated (the default when the linked file is local), but I do want to keep the structure (--no-follow-symlinks just ignores the link). So here is a one-liner that writes each link's target into a text file next to it:

find . -type l -exec bash -c 'readlink "$1" > "$1.symlink"' _ {} \;

Upvotes: 1

Zioalex

Reputation: 4333

There is no way so far, but a feature request is open to add support for copying empty directories as well:

https://github.com/aws/aws-cli/issues/912

Upvotes: 4
