Reputation: 104841
Hi.
I need to log information about every step that goes on in the application in an SQL DB. There are certain tables I want the log to be related to:
Product - should log when a product has been created, changed, etc.
Order - same as above.
Shipping - same.
etc. etc. etc.
The data will need to be retrieved often.
I have a few ideas on how to do it:
1. One Log table with a (nullable) reference field for each table that needs logging.
2. A separate log table for each table that needs logging.
3. One generic Log table plus a LogReference table that stores the name of the table and the id of the affected row.
Example of No. 3 in the above list:
CREATE TABLE [dbo].[Log](
[LogId] [int] IDENTITY(1,1) NOT NULL,
[UserId] [int] NULL,
[Description] [varchar](1024) NOT NULL,
CONSTRAINT [PK_Log] PRIMARY KEY CLUSTERED
(
[LogId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Log] WITH CHECK ADD CONSTRAINT [FK_Log_Table] FOREIGN KEY([UserId])
REFERENCES [dbo].[Table] ([TableId])
GO
ALTER TABLE [dbo].[Log] CHECK CONSTRAINT [FK_Log_Table]
---------------------------------------------------------------------
CREATE TABLE [dbo].[LogReference](
[LogId] [int] NOT NULL,
[TableName] [varchar](32) NOT NULL,
[RowId] [int] NOT NULL,
CONSTRAINT [PK_LogReference] PRIMARY KEY CLUSTERED
(
[LogId] ASC,
[TableName] ASC,
[RowId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [dbo].[LogReference] WITH CHECK ADD CONSTRAINT [FK_LogReference_Log] FOREIGN KEY([LogId])
REFERENCES [dbo].[Log] ([LogId])
GO
ALTER TABLE [dbo].[LogReference] CHECK CONSTRAINT [FK_LogReference_Log]
---------------------------------------------------------------------
CREATE FUNCTION GetLog
(
@TableName varchar(32),
@RowId int
)
RETURNS
@Log TABLE
(
LogId int not null,
UserId int null, -- nullable, to match [Log].[UserId]
Description varchar(1024) not null
)
AS
BEGIN
INSERT INTO @Log
SELECT [Log].LogId, [Log].UserId, [Log].Description
FROM [Log] INNER JOIN
LogReference ON [Log].LogId = LogReference.LogId
WHERE (LogReference.TableName = @TableName) AND (LogReference.RowId = @RowId)
RETURN
END
GO
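To show how I intend to use it, here is a rough sketch of writing and reading one entry with this design (the user id, table name and row id below are just made-up values):
-- Write a log entry and tie it to a specific product row
BEGIN TRANSACTION

DECLARE @LogId int

INSERT INTO [dbo].[Log] (UserId, Description)
VALUES (7, 'Product price changed')

SET @LogId = SCOPE_IDENTITY()

INSERT INTO [dbo].[LogReference] (LogId, TableName, RowId)
VALUES (@LogId, 'Product', 42)

COMMIT TRANSACTION

-- Read back everything logged for that product row
SELECT LogId, UserId, Description
FROM dbo.GetLog('Product', 42)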
Upvotes: 5
Views: 2367
Reputation: 700910
I would definitely go for option 3, for several reasons:
Data should be in the fields of a table, not as a table name (option 2) or a field name (option 1). That way the database gets easier to work with and easier to maintain.
Narrower tables generally perform better. The number of rows has less impact on the performance than the number of fields.
If you have a field for each table (option 1), you are likely to get a lot of empty fields when only a few of the tables are affected by an operation.
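For contrast, a rough sketch of what option 1 tends to look like (the column names are just illustrative) - every row leaves most of the reference columns NULL:
-- Option 1 sketch: one wide Log table with a nullable reference column per logged table.
-- Any single log entry only fills in one of ProductId / OrderId / ShippingId.
CREATE TABLE [dbo].[LogWide](
    [LogId] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [UserId] [int] NULL,
    [ProductId] [int] NULL,
    [OrderId] [int] NULL,
    [ShippingId] [int] NULL,
    [Description] [varchar](1024) NOT NULL
)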
Upvotes: 2
Reputation: 147374
If you're talking about large volumes of data (millions of rows+), then you will get a benefit from using different tables to store them in.
e.g. a basic example: 50 million log entries, assuming 5 different "types" of log table. Better to have 5 x 10 million row tables than 1 x 50 million row table.
INSERT performance will be better with individual tables - indexes on each table will be smaller and so quicker/easier to update/maintain as part of the insert operation.
READ performance will be better with individual tables - less data to query, smaller indexes to traverse. Also, it sounds like you'd need to store an extra column to identify what type of log entry a record is (Product, Shipping....).
Essentially, this is about partitioning data. From SQL Server 2005 onwards there is built-in support for partitioning, but you need Enterprise Edition for that. It basically allows you to partition data in one table to improve performance (e.g. you'd have your one Log table, and then define how the data within it is partitioned).
I listened to an interview with one of the eBay architects recently, who stressed the importance of partitioning when performance and scalability are needed, and I strongly agree based on my experiences.
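As a rough sketch of what that built-in partitioning could look like here (object names and boundary values are purely illustrative, and this needs Enterprise Edition):
-- Partition a single Log table by log "type" (Order/Product/Shipping...)
CREATE PARTITION FUNCTION pfLogType (varchar(32))
AS RANGE LEFT FOR VALUES ('Order', 'Product', 'Shipping')
GO

CREATE PARTITION SCHEME psLogType
AS PARTITION pfLogType ALL TO ([PRIMARY])
GO

-- The partitioning column has to be part of the clustered key
CREATE TABLE [dbo].[LogPartitioned](
    [LogId] [int] IDENTITY(1,1) NOT NULL,
    [LogType] [varchar](32) NOT NULL,
    [UserId] [int] NULL,
    [Description] [varchar](1024) NOT NULL,
    CONSTRAINT [PK_LogPartitioned] PRIMARY KEY CLUSTERED ([LogType] ASC, [LogId] ASC)
) ON psLogType([LogType])
GO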
Upvotes: 2
Reputation: 86799
Try to implement your data access layer in such a way that you can change from one database model to another if needed - that way you just pick one and worry about performance implications later.
Without doing some performance testing and having an accurate idea of the sorts of load you're going to get, it's going to be difficult to optimise, as the performance depends on a number of factors, such as the number of reads, the number of writes, and whether or not the reads and writes are likely to conflict and cause locking.
My preference would be for option 1 btw - it's the simplest to do and there are a number of tweaks you can do to help fix various sorts of problems you might have.
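As a rough sketch of that kind of abstraction, you could hide the writes behind a stored procedure so the callers never know which model is underneath (the procedure below happens to target the option 3 tables from the question, but its body could be rewritten for option 1 or 2 without touching any calling code):
-- All application code calls this; only the body changes if the log model changes.
CREATE PROCEDURE dbo.AddLogEntry
    @UserId      int,
    @TableName   varchar(32),
    @RowId       int,
    @Description varchar(1024)
AS
BEGIN
    SET NOCOUNT ON

    DECLARE @LogId int

    -- Current implementation: generic Log + LogReference (option 3)
    INSERT INTO [dbo].[Log] (UserId, Description)
    VALUES (@UserId, @Description)

    SET @LogId = SCOPE_IDENTITY()

    INSERT INTO [dbo].[LogReference] (LogId, TableName, RowId)
    VALUES (@LogId, @TableName, @RowId)
END
GO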
Upvotes: 0
Reputation: 3889
Be careful with pre-optimizing databases. Most databases are reasonably fast and somewhat complicated. You want to run a test for efficiency first.
Second, putting everything in one table makes it more likely that the results you want are in the cache, which will speed up performance immensely. Unfortunately, it also makes it much more likely that you will have to search a gigantic table to find what you are looking for. This can be partly solved with an index, but indexes don't come free (they make writing more expensive, for one).
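For example (just a sketch against the option 3 schema from the question), an index like this speeds up the GetLog lookup but has to be maintained on every insert into LogReference:
-- Supports lookups by table name + row id; the clustered key (LogId) comes along for free
CREATE NONCLUSTERED INDEX IX_LogReference_TableName_RowId
ON [dbo].[LogReference] ([TableName], [RowId])
GO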
My advice would be to run a test to see if the performance really matters, and then test the different scenarios to see which is the fastest.
Upvotes: 3