DavidS
DavidS

Reputation: 2444

efficient design for database with multiple many to many relationships

I'm pretty much building my first SQL database and am not sure if I'm going about things in an efficient manner. My data has the following structure: I have a bunch of images, from each of which I extract multiple features. A feature consists of a vector containing one or more floating point numbers. So I have images, features, vector indices, and data values.

My current thinking is to have a single table consisting of {image_id, feature_id, vector_index, data_value}. I guess the primary key would be {image_id, feature_id, vector_index}.

It seems to me that I have multiple many to many relationships, i.e. multiple indices can correspond to a single feature and multiple features can correspond to a single index, multiple images can correspond to the same feature and multiple features can correspond to the same image, etc. From my internet searching it seems like I should be making link tables to make things more efficient, but I can't figure out how that would be done. Any help would be greatly appreciated.

UPDATE:

To recap: I have data_values, which are floats. Each data value has attributes image_id, feature_id, and vector_index. The tuple {image_id, feature_id, vector_index} uniquely identifies an individual data_value.

There are many-to-many relationships between any pair of attributes: image_id:feature_id, image_id:vector_index, and vector_index:feature_id.

Typical query: Retrieve all data values corresponding to a subset of image_ids and feature_ids. I will likely never filter based on vector_index. For my application I will need the get the queried data_values into a 2D matrix where each row corresponds to a particular image_id, and each column corresponds to a particular {feature_id, vector_index} tuple.

Current plan: Get rid of vector_index. Instead turn each feature vector into a single string using, e.g., JSON. Construct a database with the following tables:

images: {image_id (PK), image_name}  
features: {feature_id (PK), feature_name}  
data_values: {data_id (PK), image_id (FK), feature_id (FK), data_vector}

Upvotes: 0

Views: 168

Answers (1)

Jasen
Jasen

Reputation: 12412

have a table image and a table feature and to join them a table image_feature that has two columns, one being the primary key of image the other being the primary key of feature.

Upvotes: 1

Related Questions