Reputation: 11
I'm new to Cassandra and want to create a social network website. How should I store images? Should I store them on the file system or in a Cassandra table? If I store images in a table, how should I structure the table?
Upvotes: 1
Views: 201
Reputation: 1
For a social website, store the images in a Cassandra table, unless your file system is network-distributed, highly available, and as scalable as Cassandra. You can create your own image table and split the images into chunks to manage them yourself.
Or you can try https://github.com/simpleisbeauty/SimpleStore, which lets you use input/output streams to read and write images in Cassandra just like on a local file system.
Upvotes: 0
Reputation: 8812
Should I store it on the file system or in a Cassandra table?
It depends on the size of your images. Cassandra is a database designed primarily to store structured data, and raw files are not structured data.
However, you may still want to use Cassandra for binary blob storage because of its multi-data-center support and high availability, which is a valid reason too.
If I store images in a table, how should I structure the table?
If the maximum possible size of your images is around 1 MB to 2 MB, you can store each image in a regular blob column like this:
CREATE TABLE images(
    image_id uuid,
    name text,
    size_in_bytes bigint,
    author text,
    ...
    content blob,
    PRIMARY KEY(image_id)
);
// Load the image by id
SELECT * FROM images WHERE image_id=xxx;
Now, if you think the image size can grow to an arbitrary size, your best option is to split each image into fixed-size chunks in your application (say 64 KB, for example) and store all the chunks in a wide partition:
CREATE TABLE images(
    image_id uuid,
    name text static,
    size_in_bytes bigint static,
    author text static,
    ...
    chunk_count int static,
    // Sequential chunk number (0, 1, 2, ...) so that chunks sort in the order needed to reassemble the image
    chunk_id int,
    content blob,
    PRIMARY KEY(image_id, chunk_id)
);
// Load all the chunks of the image, returned in chunk_id order
// Use an iterator to fetch the chunks page by page
SELECT chunk_id, content FROM images WHERE image_id=xxx;
Please note that in this case, all metadata columns (name, size_in_bytes, author ...) should be static, i.e. stored only once per partition and not repeated for every chunk.
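To illustrate the application-side splitting, here is a minimal Python sketch of cutting an image into fixed-size chunks and putting it back together. It only covers the byte handling, not the Cassandra reads and writes. It assumes chunks are numbered with sequential integers so they can be restored in order; the function names and the CHUNK_SIZE constant are made up for this example.

```python
CHUNK_SIZE = 64 * 1024  # 64 KB, the example chunk size mentioned above


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Return a list of (chunk_index, chunk_bytes) pairs in order.

    Every chunk has chunk_size bytes except possibly the last one.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (index, data[offset:offset + chunk_size])
        for index, offset in enumerate(range(0, len(data), chunk_size))
    ]


def reassemble(chunks) -> bytes:
    """Rebuild the original bytes from (chunk_index, chunk_bytes) pairs.

    The pairs are sorted by index first, so they may arrive in any order.
    """
    ordered = sorted(chunks, key=lambda pair: pair[0])
    return b"".join(content for _, content in ordered)
```

Each (chunk_index, chunk_bytes) pair would become one row in the partition for the image, and len(split_into_chunks(data)) is the value you would store as chunk_count.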
Upvotes: 3