superloop

Reputation: 95

Image storing/tagging solution

We are creating a site where users upload images that are classifiable and searchable.

My question is about storing the images: what would make a solid, maintainable solution?

I've looked at S3 - it looks promising.

If S3 is a good option, where would I store the references to the objects (along with the metadata/tags)?

Thanks :)

Upvotes: 1

Views: 142

Answers (1)

Michael - sqlbot

Reputation: 179412

If I were architecting such a system, I would certainly look no further than S3 for scalability and durability for actually storing the images -- and thumbnails -- and metadata, to some extent.

User-defined S3 object metadata is limited to 2 KB (total number of bytes of all keys and all values combined), is limited to US-ASCII, and is not indexed -- you have to fetch the metadata for the specific object. For many applications this is entirely sufficient, but that's very doubtful in your case, since you need to search across tags rather than look up one object at a time.

http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html#object-metadata
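To make those limits concrete, here is a minimal sketch of a pre-upload check. The function names and the sample metadata are hypothetical, and it only models the two constraints mentioned above (the 2 KB total and US-ASCII); it does not call S3.

```python
# Sketch only: models the 2 KB / US-ASCII limits on user-defined metadata.
# Names here are illustrative, not part of any AWS SDK.
S3_USER_METADATA_LIMIT = 2 * 1024  # bytes, keys and values combined


def metadata_size(metadata: dict) -> int:
    """Total bytes of all keys and all values, measured in UTF-8."""
    return sum(
        len(key.encode("utf-8")) + len(value.encode("utf-8"))
        for key, value in metadata.items()
    )


def fits_in_s3_metadata(metadata: dict) -> bool:
    """True if the metadata is pure ASCII and within the 2 KB budget."""
    all_ascii = all(k.isascii() and v.isascii() for k, v in metadata.items())
    return all_ascii and metadata_size(metadata) <= S3_USER_METADATA_LIMIT
```

A handful of short tags fits easily, but free-form descriptions, non-ASCII titles, or long tag lists run out of room quickly, which is one more reason to keep the real metadata elsewhere.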

So, the question "is S3 a good option" is easily answered: if you mean among AWS services, the answer is yes, and it's difficult to argue that it isn't the best fit.

You may also consider CloudFront -- not instead of, but in addition to S3. It can improve load times by caching your "popular" content closer to where users are located, among other things.


Where to store the references to the objects goes off into the land of "opinion based," which we don't do on Stack Overflow. The answer is, of course, "in a database," but AWS has options here.

I'm a relational database DBA, so of course, my inclination is that everything should have a relational database (such as RDS) as its authoritative data store, while others would probably say the DynamoDB NoSQL database offering would be a useful data store.
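If you go the relational route, the layout could look something like the sketch below. It uses Python's built-in sqlite3 purely as a local stand-in for an RDS-hosted database; the table names, column names, and helper functions are my own assumptions, not anything prescribed by AWS. The idea is simply that each row points at an S3 object by bucket and key, and tags live in a many-to-many join table.

```python
import sqlite3

# Hypothetical schema: one row per stored image, tags normalized separately.
SCHEMA = """
CREATE TABLE images (
    id        INTEGER PRIMARY KEY,
    s3_bucket TEXT NOT NULL,
    s3_key    TEXT NOT NULL,
    title     TEXT,
    UNIQUE (s3_bucket, s3_key)
);
CREATE TABLE tags (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE image_tags (
    image_id INTEGER NOT NULL REFERENCES images(id),
    tag_id   INTEGER NOT NULL REFERENCES tags(id),
    PRIMARY KEY (image_id, tag_id)
);
"""


def open_catalog(path=":memory:"):
    """Create the catalog tables in a fresh database and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_image(conn, bucket, key, title, tags):
    """Record a reference to an S3 object along with its tags."""
    cur = conn.execute(
        "INSERT INTO images (s3_bucket, s3_key, title) VALUES (?, ?, ?)",
        (bucket, key, title),
    )
    image_id = cur.lastrowid
    for name in tags:
        # Reuse an existing tag row if this tag has been seen before.
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
        tag_id = conn.execute(
            "SELECT id FROM tags WHERE name = ?", (name,)
        ).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO image_tags (image_id, tag_id) VALUES (?, ?)",
            (image_id, tag_id),
        )
    conn.commit()
    return image_id


def keys_for_tag(conn, tag):
    """Return the S3 keys of all images carrying the given tag, sorted."""
    rows = conn.execute(
        """
        SELECT i.s3_key
        FROM images i
        JOIN image_tags it ON it.image_id = i.id
        JOIN tags t ON t.id = it.tag_id
        WHERE t.name = ?
        ORDER BY i.s3_key
        """,
        (tag,),
    ).fetchall()
    return [row[0] for row in rows]
```

The image bytes never touch the database; it only holds the pointer and the searchable attributes.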

From there (wherever "there" is), CloudSearch could be populated with the metadata, keywords, etc., for processing the actual search operations, using indexes it builds which are potentially better suited to search-intensive operations than proper databases. I would not, however, try to use CloudSearch as the authoritative store of all your valuable metadata. Search indexes should be treated as disposable, rebuildable assets... although I fear even that statement might strike some as being opinion-based.
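To illustrate what "disposable, rebuildable" means in practice, here is a toy sketch. It is a plain in-memory inverted index, not CloudSearch and not its API; the record shape `(image_id, tags)` and both function names are assumptions for the example. The point is only that the index is derived entirely from the authoritative records, so it can be thrown away and regenerated at any time.

```python
from collections import defaultdict


def build_index(records):
    """Build a tag -> set of image ids mapping from authoritative records.

    `records` is an iterable of (image_id, tags) pairs, e.g. rows read
    from the database that owns the metadata.
    """
    index = defaultdict(set)
    for image_id, tags in records:
        for tag in tags:
            index[tag.lower()].add(image_id)
    return dict(index)


def search(index, terms):
    """Return ids of images that carry every one of the given terms."""
    matches = [index.get(term.lower(), set()) for term in terms]
    if not matches:
        return set()
    return set.intersection(*matches)
```

If the index is lost or its structure needs to change, calling `build_index` again on the same records gives back an equivalent index, which is not something you can say about your only copy of the metadata.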

One thing that isn't a matter of opinion is that all of these various cloud services allow you to spin up a substantial proof-of-concept infrastructure at costs that are so low as to have been unimaginable just a few years ago... so you can try them, play with them, and throw them away if they don't do what you expect. You don't have to buy before you try.

Upvotes: 2
