Prem Sompura

Reputation: 671

Get an entire bucket or more than one object from an AWS S3 bucket through Ansible

As far as I know, the Ansible S3 module can only get one object at a time.

My question is: what if I want to download/get an entire bucket, or more than one object, from an S3 bucket at once? Is there any hack?

Upvotes: 9

Views: 14127

Answers (8)

ThorSummoner

Reputation: 18149

I was able to achieve it like so:

  - name: get s3_bucket_items
    s3:
      mode: list
      bucket: MY_BUCKET
      prefix: MY_PREFIX/
    register: s3_bucket_items

  - name: download s3_bucket_items
    s3:
      mode: get
      bucket: MY_BUCKET
      object: "{{ item }}"
      dest: /tmp/
    with_items: "{{ s3_bucket_items.s3_keys }}"

Notes:

  • Your prefix should not have a leading slash.
  • The {{ item }} value will have the prefix already.
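
If you want the downloaded files laid out by key under the destination, a variant could build the dest path per key (a sketch with the same placeholder bucket/prefix names, using the regex_replace filter; the target subdirectories must already exist):

  - name: download s3_bucket_items preserving the key layout
    s3:
      mode: get
      bucket: MY_BUCKET
      object: "{{ item }}"
      # strip the listing prefix so files land relative to /tmp/
      dest: "/tmp/{{ item | regex_replace('^MY_PREFIX/', '') }}"
    with_items: "{{ s3_bucket_items.s3_keys }}"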

Upvotes: 10

Nissanka

Reputation: 123

You have to first list the files into a variable, then copy the files using that variable.

- name: List files
  aws_s3: 
    aws_access_key: 'YOUR_KEY'
    aws_secret_key: 'YOUR_SECRET'
    mode: list
    bucket: 'YOUR_BUCKET'
    prefix: 'YOUR_BUCKET_FOLDER' # Remember to add trailing slashes
    marker: 'YOUR_BUCKET_FOLDER' # Remember to add trailing slashes
  register: 's3BucketItems'

- name: Copy files
  aws_s3:
    aws_access_key: 'YOUR_KEY'
    aws_secret_key: 'YOUR_SECRET'
    bucket: 'YOUR_BUCKET'
    object: '{{ item }}'
    dest: 'YOUR_DESTINATION_FOLDER/{{ item|basename }}'
    mode: get
  with_items: '{{ s3BucketItems.s3_keys }}'

Upvotes: 4

mva

Reputation: 534

A non-Ansible solution, but I finally got it working on an instance running with an assumed role that has S3 bucket access, or with the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables set:

---
- name: download fs s3 bucket
  command: aws s3 sync s3://{{ s3_backup_bucket }} {{ dst_path }}
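
If the instance is not running with a role, one option (a sketch; the key variable names here are placeholders, e.g. loaded from a vault file) is to pass the credentials through the task-level environment keyword instead of exporting them on the host:

- name: download fs s3 bucket
  command: aws s3 sync s3://{{ s3_backup_bucket }} {{ dst_path }}
  environment:
    # placeholder variables holding the credentials
    AWS_ACCESS_KEY_ID: "{{ aws_access_key_id }}"
    AWS_SECRET_ACCESS_KEY: "{{ aws_secret_access_key }}"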

Upvotes: 1

numa08

Reputation: 197

You can do it like this:

- name: Get s3 objects
  s3:
    bucket: your-s3-bucket
    prefix: your-object-directory-path
    mode: list
  register: s3_object_list

- name: Create download directory
  file:
    path: "/your/destination/directory/path/{{ item | dirname }}"
    state: directory
  with_items:
      - "{{ s3_object_list.s3_keys }}"

- name: Download s3 objects
  s3:
    bucket: your-s3-bucket
    object: "{{ item }}"
    mode: get
    dest: "/your/destination/directory/path/{{ item }}"
  with_items:
    - "{{ s3_object_list.s3_keys }}" 

Upvotes: 3

Pawel Szmuc

Reputation: 1

Maybe you could change your "with_items"; then it should work:

  - name: get list to download
    aws_s3:
      region: "{{ region }}"
      bucket: "{{ item }}"
      mode: list
    with_items: "{{ s3_bucketlist }}"
    register: s3_bucket_items

but maybe faster is:

  - name: Sync directory from S3 to disk
    command: "aws --region {{ region }} s3 sync s3://{{ bucket }}/ /tmp/test"

Upvotes: 0

Norm1710

Reputation: 13

The following code will list every file in every S3 bucket in the account. It is run as a role with a group_vars/localhost/vault.yml containing the AWS keys.

I still haven't found out why the more straightforward Method II doesn't work, but maybe someone can enlighten us.

- name: List S3 Buckets
  aws_s3_bucket_facts:
    aws_access_key: "{{ aws_access_key_id }}"
    aws_secret_key: "{{ aws_secret_access_key }}"
#    region: "eu-west-2"
  register: s3_buckets

#- debug: var=s3_buckets

- name: Iterate buckets
  set_fact:
    app_item: "{{ item.name }}"
  with_items: "{{ s3_buckets.ansible_facts.buckets }}"
  register: app_result

#- debug: var=app_result.results  #.item.name <= does not work??

- name: Create Fact List
  set_fact:
    s3_bucketlist: "{{ app_result.results | map(attribute='item.name') | list }}"

#- debug: var=s3_bucketlist

- name: List S3 Bucket files - Method I - works
  local_action:
    module: aws_s3
    bucket: "{{ item }}"
    aws_access_key: "{{ aws_access_key_id }}"
    aws_secret_key: "{{ aws_secret_access_key }}"
    mode: list
  with_items:
    - "{{ s3_bucketlist }}"
  register: s3_list_I

#- debug: var=s3_list_I

- name: List S3 Bucket files - Method II - does not work
  aws_s3:
    aws_access_key: "{{ aws_access_key_id }}"
    aws_secret_key: "{{ aws_secret_access_key }}"
    bucket: "{{ item }}"
    mode: list
    with_items: "{{ s3_bucketlist }}"
  register: s3_list_II

Upvotes: 0

M. Glatki

Reputation: 787

The Ansible S3 module currently has no built-in way to synchronize buckets to disk recursively.

In theory, you could try to collect the keys to download with a task like this:

- name: register keys for synchronization
  s3:
    mode: list
    bucket: hosts
    object: /data/*
  register: s3_bucket_items

- name: sync s3 bucket to disk
  s3:
    mode: get
    bucket: hosts
    object: "{{ item }}"
    dest: /etc/data/conf/
  with_items: "{{ s3_bucket_items.s3_keys }}"

While I often see this solution, it does not seem to work with current Ansible/boto versions, due to a bug with nested S3 'directories' (see this bug report for more information) and the Ansible S3 module not creating subdirectories for keys. I believe you may also run into memory issues using this method when syncing very large buckets.

I would also like to add that you most likely do not want to use credentials coded into your playbooks; I suggest you use IAM EC2 instance profiles instead, which are much more secure and convenient.
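
For example, with an instance profile attached, the tasks above should work without any key parameters at all (a sketch; the module falls back to boto's normal credential chain):

- name: list keys using the instance profile, no credentials in the playbook
  s3:
    mode: list
    bucket: hosts
  register: s3_bucket_items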

A solution that works for me is this:

- name: Sync directory from S3 to disk
  command: "s3cmd sync -q --no-preserve s3://hosts/{{ item }}/ /etc/data/conf/"
  with_items:
    - data

Upvotes: 3

Bruce P

Reputation: 20759

As of Ansible 2.0 the S3 module includes the list action, which lets you list the keys in a bucket.
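
A minimal sketch of the list action (the bucket name is a placeholder):

- name: List keys
  s3:
    bucket: my-bucket
    mode: list
  register: s3keys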

If you're not ready to upgrade to Ansible 2.0 yet then another approach might be to use a tool like s3cmd and invoke it via the command module:

- name: Get objects
  command: s3cmd ls s3://my-bucket/path/to/objects
  register: s3objects
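
You could then loop over the registered output to fetch each object; a rough sketch, assuming every line of the s3cmd ls output ends with the object's s3:// URI:

- name: Download objects
  command: "s3cmd get {{ item.split() | last }} /tmp/"
  with_items: "{{ s3objects.stdout_lines }}"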

Upvotes: 1
