Bbb

Reputation: 149

S3 high level design

I'm trying to figure out how to deal with S3 (without any training from my organisation - yes this sucks but that's where I am :( ) and am a bit overwhelmed by the choices for what I'm trying to do.

I want to provide a simple archive storage service (preferably just using the AWS web gui) to hundreds of internal customer groups (i.e. clients) in my large organisation (~100,000 staff)... My question comes down to how I should structure buckets and keys so that:

a) I don't hit the 100 bucket limit (I know I can ask for more, but would AWS let me increase this to, say, 500 on request?)

b) I can have multiple individually identifiable users for each client who can control the data for that client only and only put/get data for their client.

c) I can bill each client for their usage, preferably with whatever automated report AWS can provide... or do I need to script this with the API?

  1. Do I go for hundreds of buckets? One bucket with a folder per client?

  2. If one bucket, can I please get some pointers for how I provide different users access to different folders within one bucket. I've been looking at groups, roles, policies etc. and have my head in a twist, and I'm not sure which way I'm meant to go...

Is there some better documentation, such as a book, that explains this better than AWS's online help?

Thanks....

Upvotes: 1

Views: 274

Answers (1)

BraveNewCurrency

Reputation: 13065

without any training from my organisation

Your org is large enough that you should be able to get time with an AWS Architect (who works for AWS) - they really know their stuff and can give definitive answers to questions like this.

I want to provide a simple archive storage service

OK. You want to be thinking of Lifecycle Policies, Glacier, Infrequent Access and Reduced Redundancy too.
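As a rough illustration of what a lifecycle policy looks like, here is a minimal sketch that builds one as plain data. The rule ID, the `dept-a/` prefix and the 30/90 day thresholds are made-up values for the example, not recommendations; the dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts, but check the current S3 docs before relying on it.

```python
import json


def archive_lifecycle_rules(prefix="", ia_after_days=30, glacier_after_days=90):
    """Build a lifecycle configuration that moves ageing objects to cheaper storage.

    All defaults here are illustrative assumptions, not AWS recommendations.
    """
    if glacier_after_days <= ia_after_days:
        raise ValueError("glacier_after_days must be greater than ia_after_days")
    return {
        "Rules": [
            {
                "ID": "archive-tiering",          # hypothetical rule name
                "Filter": {"Prefix": prefix},     # "" applies to the whole bucket
                "Status": "Enabled",
                "Transitions": [
                    # First step down: Infrequent Access
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    # Second step down: Glacier
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Print the configuration as JSON so it can be reviewed before use.
    print(json.dumps(archive_lifecycle_rules("dept-a/"), indent=2))
```

The same structure can be entered by hand in the console's lifecycle rule editor; generating it from code mostly helps when you have hundreds of prefixes to cover.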

(preferably just using the AWS web gui)

Hmm, the AWS GUI isn't the best. I'm sure there are many better ways (such as Firefox/Chrome extensions, etc.). You also need to be the expert on S3 sync programs. Many of them don't scale. (Hint - don't use s3fs type programs, they can be hard to debug.)

to hundreds of internal customer groups

I can think of 3 ways:

1) One account, one (or more) IAM users per department (possibly via user security integration). One directory (i.e. key prefix) per department. You can set up the permissions on an IAM user such that they can only 'see' files in a single directory in a single bucket. The big downside is that you get one bill from AWS, and no easy way to know who did what.

2) One bucket per department. You may need N/100 accounts if AWS can't raise the limit (not sure, just ask them. They don't bite.). Each IAM user can be restricted to just their bucket. You can create trust for an admin user between the accounts so you can manage it easily. If you tag your buckets, you can get per-tag bills.

3) Many accounts - one per department. They can manage their own IAM users. Each department can create many buckets. You can set up trust between one central account so you can admin everything with one IAM user. They each get their own bill from AWS.

Frankly, #1 is a little tricky if you want accurate billing. #2 will require extra work from you, but could be made to work. #3 is the cleanest.
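If you do go with #1 and end up scripting the billing yourself, one rough approach is to sum object sizes per top-level folder. The sketch below assumes you already have a list of `(key, size_in_bytes)` pairs from somewhere (a bucket listing or an inventory report); the function name and the `(no prefix)` label are made up for the example. It only covers stored bytes, not requests or data transfer, so it is an approximation of usage rather than a bill.

```python
from collections import defaultdict


def usage_by_department(objects):
    """Sum object sizes per top-level folder.

    `objects` is an iterable of (key, size_in_bytes) pairs. Keys without a
    "/" are grouped under a placeholder label.
    """
    totals = defaultdict(int)
    for key, size in objects:
        # The department is assumed to be the first path segment of the key.
        dept = key.split("/", 1)[0] if "/" in key else "(no prefix)"
        totals[dept] += size
    return dict(totals)


if __name__ == "__main__":
    sample = [
        ("hr/contracts/2015.pdf", 100),
        ("hr/payroll.csv", 50),
        ("it/backup.tar", 7),
        ("stray-file.txt", 1),
    ]
    for dept, total in sorted(usage_by_department(sample).items()):
        print(dept, total)
```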

I do worry that the structure you have defined is "not the best", i.e. departments will argue over who stores what ("I sent you that report, then I deleted it!") instead of optimizing globally ("Let's store 1 copy of the report."). In small organizations, #1 is best. In large ones, I imagine #3 is best, with you being merely an internal consultant to help departments pay AWS directly.

If one bucket, can I please get some pointers for how I provide different users access to different folders within one bucket.

It's easy. Just create a policy that restricts them to one folder in a bucket.

You want to create a group per department, since it's bad to have rules for each user. Always put users in groups, and only put permissions on groups.
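To make that concrete, here is a minimal sketch that generates such a policy document for one department. The bucket and folder names are placeholders and the function name is made up; the statement layout (listing limited by an `s3:prefix` condition, object actions limited by the resource ARN) is the usual pattern, but treat it as a starting point rather than a finished policy. People using the web console may also need permission to list the buckets in the account, which is not included here.

```python
import json


def folder_policy(bucket, folder):
    """Build an IAM policy document limiting access to one folder of one bucket."""
    prefix = folder.strip("/") + "/"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, so it is narrowed with a
                # condition on the requested prefix.
                "Sid": "ListOwnFolder",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [prefix + "*"]}},
            },
            {
                # Object-level actions are narrowed by the resource ARN itself.
                "Sid": "ReadWriteOwnFolder",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


if __name__ == "__main__":
    # "example-archive" and "dept-a" are placeholder names.
    print(json.dumps(folder_policy("example-archive", "dept-a"), indent=2))
```

Attach the resulting JSON to the department's group, not to individual users, and generate one document per department.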

Is there some better documentation

S3 is fairly complex, but the documentation is pretty good. You just need to read it end-to-end. Frankly, if your company has 100,000 people, they should be paying for lots of AWS architect training for you. Competent consultants who can help are quite expensive, and it's hard to tell who is competent.

Upvotes: 2
