Reputation: 2444
I'm pretty much building my first SQL database and am not sure if I'm going about things in an efficient manner. My data has the following structure: I have a bunch of images, from each of which I extract multiple features. A feature consists of a vector containing one or more floating point numbers. So I have images, features, vector indices, and data values.
My current thinking is to have a single table consisting of {image_id, feature_id, vector_index, data_value}. I guess the primary key would be {image_id, feature_id, vector_index}.
It seems to me that I have multiple many to many relationships, i.e. multiple indices can correspond to a single feature and multiple features can correspond to a single index, multiple images can correspond to the same feature and multiple features can correspond to the same image, etc. From my internet searching it seems like I should be making link tables to make things more efficient, but I can't figure out how that would be done. Any help would be greatly appreciated.
UPDATE:
To recap: I have data_values
, which are floats. Each data value has attributes image_id
, feature_id
, and vector_index
. The tuple {image_id
, feature_id
, vector_index
} uniquely identifies an individual data_value
.
There are many-to-many relationships between any pair of attributes: image_id
:feature_id
, image_id
:vector_index
, and vector_index
:feature_id
.
Typical query: Retrieve all data values corresponding to a subset of image_id
s and feature_id
s. I will likely never filter based on vector_index
. For my application I will need the get the queried data_value
s into a 2D matrix where each row corresponds to a particular image_id
, and each column corresponds to a particular {feature_id
, vector_index
} tuple.
Current plan: Get rid of vector_index
. Instead turn each feature vector into a single string using, e.g., JSON. Construct a database with the following tables:
images: {image_id (PK), image_name}
features: {feature_id (PK), feature_name}
data_values: {data_id (PK), image_id (FK), feature_id (FK), data_vector}
Upvotes: 0
Views: 168
Reputation: 12412
have a table image
and a table feature
and to join them a table image_feature
that has two columns, one being the primary key of image the other being the primary key of feature.
Upvotes: 1