Reputation: 671
As far as I know, Ansible's S3 module can only get one object at a time.
My question is: what if I want to download/get an entire bucket, or more than one object from an S3 bucket, at once? Is there any hack?
Upvotes: 9
Views: 14127
Reputation: 18149
I was able to achieve it like so:
- name: get s3_bucket_items
  s3:
    mode: list
    bucket: MY_BUCKET
    prefix: MY_PREFIX/
  register: s3_bucket_items

- name: download s3_bucket_items
  s3:
    mode: get
    bucket: MY_BUCKET
    object: "{{ item }}"
    dest: /tmp/
  with_items: "{{ s3_bucket_items.s3_keys }}"
Note: the {{ item }} value will already contain the prefix.
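If you would rather not reproduce the prefix under /tmp/, a minimal variation (my own sketch, not part of the original answer) is to build dest from the key's basename:

- name: download s3_bucket_items without the prefix in the path
  s3:
    mode: get
    bucket: MY_BUCKET
    object: "{{ item }}"
    dest: "/tmp/{{ item | basename }}"   # basename drops MY_PREFIX/ from the key
  with_items: "{{ s3_bucket_items.s3_keys }}"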
Upvotes: 10
Reputation: 123
You first have to list the files into a variable, then copy the files using that variable.
- name: List files
  aws_s3:
    aws_access_key: 'YOUR_KEY'
    aws_secret_key: 'YOUR_SECRET'
    mode: list
    bucket: 'YOUR_BUCKET'
    prefix: 'YOUR_BUCKET_FOLDER'    # Remember to add a trailing slash
    marker: 'YOUR_BUCKET_FOLDER'    # Remember to add a trailing slash
  register: s3BucketItems

- name: Copy files
  aws_s3:
    aws_access_key: 'YOUR_KEY'
    aws_secret_key: 'YOUR_SECRET'
    bucket: 'YOUR_BUCKET'
    object: '{{ item }}'
    dest: 'YOUR_DESTINATION_FOLDER/{{ item | basename }}'
    mode: get
  with_items: '{{ s3BucketItems.s3_keys }}'
Upvotes: 4
Reputation: 534
This is the non-Ansible solution, but I finally got it working on an instance running with an assumed role that has S3 bucket access, or with the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables set.
---
- name: download fs s3 bucket
  command: aws s3 sync s3://{{ s3_backup_bucket }} {{ dst_path }}
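If the host has no assumed role, one way to supply the keys (a sketch on my part, with assumed variable names) is to pass them to the CLI through the task's environment:

- name: download fs s3 bucket with explicit credentials
  command: aws s3 sync s3://{{ s3_backup_bucket }} {{ dst_path }}
  environment:
    AWS_ACCESS_KEY_ID: "{{ aws_access_key_id }}"            # assumed variable name
    AWS_SECRET_ACCESS_KEY: "{{ aws_secret_access_key }}"    # assumed variable name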
Upvotes: 1
Reputation: 197
This will do it:
- name: Get s3 objects
  s3:
    bucket: your-s3-bucket
    prefix: your-object-directory-path
    mode: list
  register: s3_object_list

- name: Create download directory
  file:
    path: "/your/destination/directory/path/{{ item | dirname }}"
    state: directory
  with_items:
    - "{{ s3_object_list.s3_keys }}"

- name: Download s3 objects
  s3:
    bucket: your-s3-bucket
    object: "{{ item }}"
    mode: get
    dest: "/your/destination/directory/path/{{ item }}"
  with_items:
    - "{{ s3_object_list.s3_keys }}"
Upvotes: 3
Reputation: 1
Maybe you could change your with_items; then it should work:
- name: get list to download
  aws_s3:
    region: "{{ region }}"
    bucket: "{{ item }}"
    mode: list
  with_items: "{{ s3_bucketlist }}"
  register: s3_bucket_items
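To then actually download what was listed, one option (my own sketch, not part of the original answer) is to loop over the registered per-bucket results with with_subelements:

- name: download every listed key from every bucket
  aws_s3:
    region: "{{ region }}"
    bucket: "{{ item.0.item }}"          # the bucket the listing task ran against
    object: "{{ item.1 }}"               # one key from that bucket's s3_keys
    dest: "/tmp/{{ item.1 | basename }}"
    mode: get
  with_subelements:
    - "{{ s3_bucket_items.results }}"
    - s3_keys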
but a faster option is probably:
- name: Sync directory from S3 to disk
  command: "aws --region {{ region }} s3 sync s3://{{ bucket }}/ /tmp/test"
Upvotes: 0
Reputation: 13
The following code will list every file in every S3 bucket in the account. It is run as a role, with a group_vars/localhost/vault.yml containing the AWS keys.
I still haven't found out why the second, more straightforward Method II doesn't work, but maybe someone can enlighten us.
- name: List S3 Buckets
  aws_s3_bucket_facts:
    aws_access_key: "{{ aws_access_key_id }}"
    aws_secret_key: "{{ aws_secret_access_key }}"
    # region: "eu-west-2"
  register: s3_buckets
#- debug: var=s3_buckets

- name: Iterate buckets
  set_fact:
    app_item: "{{ item.name }}"
  with_items: "{{ s3_buckets.ansible_facts.buckets }}"
  register: app_result
#- debug: var=app_result.results   # .item.name <= does not work??

- name: Create Fact List
  set_fact:
    s3_bucketlist: "{{ app_result.results | map(attribute='item.name') | list }}"
#- debug: var=s3_bucketlist

- name: List S3 Bucket files - Method I - works
  local_action:
    module: aws_s3
    bucket: "{{ item }}"
    aws_access_key: "{{ aws_access_key_id }}"
    aws_secret_key: "{{ aws_secret_access_key }}"
    mode: list
  with_items:
    - "{{ s3_bucketlist }}"
  register: s3_list_I
#- debug: var=s3_list_I

- name: List S3 Bucket files - Method II - does not work
  aws_s3:
    aws_access_key: "{{ aws_access_key_id }}"
    aws_secret_key: "{{ aws_secret_access_key }}"
    bucket: "{{ item }}"
    mode: list
  with_items: "{{ s3_bucketlist }}"
  register: s3_list_II
Upvotes: 0
Reputation: 787
The Ansible S3 module currently has no built-in way to synchronize buckets to disk recursively.
In theory, you could try to collect the keys to download with a list task and then fetch each one with a get task:
- name: register keys for synchronization
  s3:
    mode: list
    bucket: hosts
    object: /data/*
  register: s3_bucket_items

- name: sync s3 bucket to disk
  s3:
    mode: get
    bucket: hosts
    object: "{{ item }}"
    dest: /etc/data/conf/
  with_items: "{{ s3_bucket_items.s3_keys }}"
While I often see this solution, it does not seem to work with current Ansible/boto versions, due to a bug with nested S3 'directories' (see this bug report for more information) and the Ansible S3 module not creating subdirectories for keys. I believe you could also run into memory issues with this method when syncing very large buckets.
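A possible workaround for the missing subdirectories (my own sketch, in the spirit of the per-key approach above) is to create them yourself before the get tasks run:

- name: create sub-directories for the keys first
  file:
    path: "/etc/data/conf/{{ item | dirname }}"   # dirname strips the file name from the key
    state: directory
  with_items: "{{ s3_bucket_items.s3_keys }}"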
I would also like to add that you most likely do not want credentials hard-coded into your playbooks; I suggest you use IAM EC2 instance profiles instead, which are much more secure and convenient.
A solution that works for me is this:
- name: Sync directory from S3 to disk
  command: "s3cmd sync -q --no-preserve s3://hosts/{{ item }}/ /etc/data/conf/"
  with_items:
    - data
Upvotes: 3
Reputation: 20759
As of Ansible 2.0 the S3 module includes the list action, which lets you list the keys in a bucket.
If you're not ready to upgrade to Ansible 2.0 yet then another approach might be to use a tool like s3cmd and invoke it via the command module:
- name: Get objects
  command: s3cmd ls s3://my-bucket/path/to/objects
  register: s3objects
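To actually fetch what was listed, a rough continuation (my own sketch, not part of the original answer) is to take the S3 URL from the last column of each s3cmd ls line and hand it to s3cmd get:

- name: Download objects
  command: "s3cmd get {{ item.split() | last }} /tmp/"   # last field of each ls line is the s3:// URL
  with_items: "{{ s3objects.stdout_lines }}"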
Upvotes: 1