Dilip

Reputation: 1162

Is a single table with multiple updates better than multiple tables inserted at different times in Redshift?

I have a task fact table with the following columns:

As you can see, these timestamps can be updated at any point after the task is created, which means these columns will keep getting updated in Redshift.

Q1: Do I create multiple tables with assignment, comment, close and join them when needed?

The following reports are run:

Q2: Is it okay to store the first comment here, given that Redshift is columnar storage? This text field is only used for exporting to CSV.

Upvotes: 0

Views: 102

Answers (1)

NickW

Reputation: 9798

You wouldn't create multiple tables; you would create a single fact table with an FK to an Event Type dimension, e.g.

  • Task ID (BIGINT)
  • User ID (BIGINT)
  • Event ID (BIGINT)
  • Event Timestamp (TIMESTAMP)
  • Comment (VARCHAR(3000))
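That structure could be sketched in Redshift DDL roughly as follows. The table names, column names, event names, and the DISTKEY/SORTKEY choices are illustrative assumptions, not something stated in the question:

```sql
-- Hypothetical event-type dimension (names are assumptions)
CREATE TABLE dim_event_type (
    event_id   BIGINT       NOT NULL,
    event_name VARCHAR(50)  NOT NULL,  -- e.g. 'created', 'assigned', 'commented', 'closed'
    PRIMARY KEY (event_id)
);

-- Insert-only fact table: one row per task event
CREATE TABLE fact_task_event (
    task_id         BIGINT         NOT NULL,
    user_id         BIGINT         NOT NULL,
    event_id        BIGINT         NOT NULL REFERENCES dim_event_type (event_id),
    event_timestamp TIMESTAMP      NOT NULL,
    comment         VARCHAR(3000)
)
DISTKEY (task_id)          -- assumption: co-locate each task's events on one slice
SORTKEY (event_timestamp); -- assumption: reports usually filter by time
```

Note that Redshift does not enforce primary or foreign key constraints; they are informational, but the query planner can use them.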

But if you did this, you obviously couldn't also hold the derived metrics (e.g. time taken to first assignment) on the same row.

Whether it is better for the fact table to be insert-only (faster to load, with metrics calculated at query time), or for it to perform updates (slower to load, but with metrics pre-calculated so queries run faster), is something only you can decide, because:

a) Only you can find out how much faster/slower each approach is

b) Only you know what "better" means for your specific circumstances
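To illustrate the insert-only side of that trade-off: a metric such as "time to first assignment" can be derived at query time from the event rows. A sketch, assuming the hypothetical table and event names used above:

```sql
-- Seconds from task creation to first assignment, per task
-- (table, column, and event names are illustrative assumptions)
SELECT
    f.task_id,
    DATEDIFF(
        second,
        MIN(CASE WHEN d.event_name = 'created'  THEN f.event_timestamp END),
        MIN(CASE WHEN d.event_name = 'assigned' THEN f.event_timestamp END)
    ) AS seconds_to_first_assignment
FROM fact_task_event AS f
JOIN dim_event_type  AS d ON d.event_id = f.event_id
GROUP BY f.task_id;
```

The update-heavy alternative would instead store `seconds_to_first_assignment` as a column on the task row and rewrite it when the assignment event arrives.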

Upvotes: 1
