fingers10
fingers10

Reputation: 7937

Is it efficient to use relational operators on date or integer values stored in char column in ms sql server?

I have been in development for 5 years. But this looks completely new to me. Recently I'm working on Legacy application where date and integer values are stored in CHAR column in ms sql server. And the shocking thing (at-least for me because I have never ever thought in this direction before) is that relational operators work with date values stored in CHAR column.

I'm afraid that this will have a performance issue. Is that correct or I'm wrong? Any implicit casting or conversion will happen in ms sql server before applying relation operators? Or this is normal to do in ms sql server?

We are planning to migrate this and want to check if I can proceed with this setup or date needs to be stored in DATE type column and integer to be in int type column in ms sql server?

I also noted that all the dates are stored as string in "YYYY/MM/DD" format in table.

Here is the test table that I have created and tested.

CREATE TABLE [dbo].[TestTable](
    [Date] [char](10) NULL,
    [Integer] [char](10) NULL
) ON [PRIMARY]
GO

SELECT * FROM TESTTABLE

enter image description here

SELECT * FROM TESTTABLE where [Date] > '1993/02/10'

enter image description here

Please advise.

Upvotes: 3

Views: 1286

Answers (3)

I think that you're going to have more trouble with Date conversion from different formats, than with performance.

Lets say that your system (or some of your systemS), decide to send something like '1993-02-10', you'd have to "treat" this new string because it's never going to match for your queries.

Coming back to code and performance issues, I've made this quick example to see if we can catch the cost of "Implicit casting", but it seems to be despicable (See the CONVERT_IMPLICIT at the end of the Query plan?):

-- Create a temp table to test
CREATE TABLE #TestTable (
    [Date] [char](10) NULL,
    [Integer] [char](10) NULL
)

-- And an index to search for [Integer]
CREATE INDEX Idx_Integer ON #TestTable ([Integer]);

// Create some data
INSERT INTO #TestTable( [Date], [Integer] ) VALUES('2019/01/01', 1)
INSERT INTO #TestTable( [Date], [Integer] ) VALUES('2019/02/01', 2)
INSERT INTO #TestTable( [Date], [Integer] ) VALUES('2019/03/01', 3)
INSERT INTO #TestTable( [Date], [Integer] ) VALUES('2019/04/01', 4)
INSERT INTO #TestTable( [Date], [Integer] ) VALUES('2019/05/01', 5)
GO 100000

// And a unique record so SQL uses the index
INSERT INTO #TestTable( [Date], [Integer] ) VALUES('2019/05/01', 7)

// Select with Query-Plan (Ctrl+M)
SELECT      [Integer]
    FROM #TestTable
    WHERE   [Integer] = 7

DROP TABLE #TestTable;

enter image description here

VS:

enter image description here

Happy coding!

Upvotes: 3

Gordon Linoff
Gordon Linoff

Reputation: 1269633

It is better to store values using the correct data types.

However, if dates are stored as YYYYMMDD format -- consistently -- then comparison and ordering operations work fine. That is, the comparisons as strings are the same as the comparisons as dates.

If you have an opportunity to fix the data, you can try. Do be warned that that partial fixes might have a big impact on database performance. If such columns are used for joins -- for instance -- then the type conversion generally kills optimizations. Of course, it is fine if the values have the same types in both columns.

If this is a working legacy applications, then just plan on fixing it when the application is replaced.

Upvotes: 4

GMB
GMB

Reputation: 222432

Don't store dates as strings. Use the relevant date-like datatype instead.

While a string format like YYYY/MM/DD allows you to do equality and inequality comparisons and sorting, other limitations will pop up, for example:

  • date manipulations, eg adding or substracting intervals, are tedious with strings
  • data integrity cannot be enforced (how to you ensure that your string is a valid date?)

To work around these, you will soon find yourself converting strings to dates in your queries, which is highly inefficient.

Upvotes: 7

Related Questions