How can I retriever the number of DISTINCT projects in one table to store a boolean for each user and each project in another table?

Question

I have a table for project announcements with the structure -

**Announcements Table**

| ---------- | ----------------- |
|  Project   |   Announcement    | 
| ---------- | ----------------- | 
|  [String]  |    [String]       |
| ---------- | ----------------- |

I want to allow users to subscribe to projects they would like to receive announcements for so that they may receive these announcements by email. Therein lies my problem.

I could structure the database as follows -

**Subscription Table**

| ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | 
|  Name      |  Email     |  Project1  |  Project2  |  Project3  |  ...etc... | 
| ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | 
| [String]   | [String]   | [Boolean]  | [Boolean]  | [Boolean]  | [Boolean]  |
| ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |

with a Boolean representing whether or not each user is subscribed to each project.

This has the advantage of being a simple structure, however the disadvantage that any new projects would have to be added manually. This should not be terribly difficult because there are only likely to be <4 new projects per year added to the Announcements table, though obviously it is not ideal.

Obviously to get the number of distinct projects in the announcements table DISTINCT or GROUP BY could be used. But I am not sure how to use this in order to store data / generate data for the second table.

Frazz · Accepted Answer

You really shouldn't be doing that. Having a list of boolean columns will give you problems... a lot of problems. As projects change over time, you will have to do maintenance on the table and it will be hell.

If you want to start with a good normalized structure, you should look at something like this:

Users

id    INTEGER PRIMARY KEY
Name  VARCHAR
email VARCHAR

Projects

id   INTEGER PRIMARY KEY
Name VARCHAR

Annoucements

id         INTEGER PRIMARY KEY
project_id INTEGER REFERENCES Projects (id)
text       VARCHAR

Subscriptions

user_id    INTEGER REFERENCS Users (id)
project_id INTEGER REFERENCES Projects (id)

with a UNIQUE or PRIMARY key on user_id, project_id

This is how you should handle many-to-many relations. No structure changes are required when you need a new project. It is all reduced to adding or removing rows from one or more tables... rows, not columns.

For example... if you want a list of all annoucements that users have subscribed to:

SELECT u.Name, u.email, p.Name, a.text
FROM subscriptions s
JOIN users         u ON u.id=s.user_id
JOIN projects      p ON p.id=s.project_id
JOIN annoucements  a ON a.project_id=s.project_id

Separating Projects from Announcements is not necessary, but I do advise it. If you have few projects and many announcements, or if you have different people managing projects and annoucements, then you should definitely use 2 different tables. Using a VARCHAR column to identify groups of rows is not a good habit for various reasons (spelling mistakes, upper/lower case). Your foreign keys should always be field the user knows nothing about and will never want to change.

If you ever want to change a project name, you will have to do it massively. If you ever want to add other info concerning projects, you cannot do it on the annoucements table without duplicating data... which is not a good thing... and will bring integrity problems sooner or later.

How can I retriever the number of DISTINCT projects in one table to store a boolean for each user and each project in another table?

Answers (1)

Related Questions