Reputation: 27659
I have two tables:
CREATE TABLE [dbo].[CustomerEmails](
[id] [int] IDENTITY(1,1) NOT NULL,
[datecreated] [datetime] NULL,
[UID] [nvarchar](250) NULL,
[From] [nvarchar](100) NULL,
[To] [nvarchar](100) NULL,
[Subject] [nvarchar](max) NULL,
[Body] [nvarchar](max) NULL,
[Dated] [datetime] NULL,
CONSTRAINT [PK_CustomerEmails] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
)
CREATE TABLE [dbo].[CustomerEmailIds](
[Email] [varchar](200) NULL
) ON [PRIMARY]
You may be thinking that why i am not using the EmailId in table CustomerEmails instead of using the emails itself? what i mean is that i can add a column EmailId (INT) to the table CustomerEmailIds & then i can refer that column instead of CustomerEmails.From & CustomerEmails.To?? Guys the issue is the table CustomerEmails is used by some other application & that application just keeps the track of emails sent/received through OutLook. The table CustomerEmailIds have customer emails & these emails come into system from the application i am working on.
So the Table CustomerEmails have 7,00,000+ records while table CustomerEmailIds have 1,00,000+ records.
I need to find out the emails from table CustomerEmails, based on the emails in CustomerEmailIds table.
The query i am using is:
SELECT
e.*
FROM CustomerEmails e
WHERE EXISTS
(
SELECT Email
FROM CustomerEmailIds c
WHERE ( ISNULL(e.[From],'') + '/' + ISNULL(e.[To],'') ) LIKE '%'+c.Email+'%'
)
Some facts:
1- I am using SQL Server 2008
2- Sorry guys, i forgot to mention that the CustomerEmails.**To** can contains multiple comma separated emails like: [email protected],[email protected],[email protected]
3- Because of the fact2 the c.Cs3Emails+'%' OR to= c.Cs3Emails will not list the desired results that's why i am using '%'+c.Cs3Emails+'%'
Latest Findings: Guys the above query return wrong results..... & i don't know why??
But below query works fine:
SELECT
e.*
FROM CustomerEmails e
WHERE (ISNULL(e.[From],'') + '/' + ISNULL(e.[To],'')) LIKE '%[email protected]%'
Upvotes: 0
Views: 735
Reputation: 27659
Well guys first look at the query which is returning the wrong results:
SELECT
e.*
FROM CustomerEmails e
WHERE EXISTS
(
SELECT Email
FROM CustomerEmailIds c
WHERE ( ISNULL(e.[From],'') + '/' + ISNULL(e.[To],'') ) LIKE '%'+c.Email+'%'
)
There is no issue in this query, the issue is that some of the emails in CustomerEmailIds table are invalid emails which are ('.','@','0','-') & that's why the query is returning all the CustomerEmails where these invalid emails exists.
Thanks to Kev Riley! the persons who helped me find out the cause of this issue here!
Upvotes: 0
Reputation: 3106
ADD INDEXING
http://beginner-sql-tutorial.com/sql-index.htm
Upvotes: 0
Reputation: 7612
Yaqub, The where clause in your query will never use an index. This is because of 2 problems:
1) you are doing string concatenation on the fly to compare
ISNULL(e.[From],'') + '/' + ISNULL(e.[To],'') )
2) you are using wild card at both end after like
LIKE '%'+c.Cs3Emails+'%'
To optimize, this is what I suggest:
1) You create another column to store the concatenated value of the emails. Extra storage, but this will save you from running slow. Then you can create an index on that column to speed things up.
2) Is it possible that you change the like condition?
LIKE c.Cs3Emails+'%'
Removing the %
from begining might use the index (if any on c.Cs3Emails)
Hope this helps
Upvotes: 3
Reputation: 26861
This part of the query:
WHERE ( ISNULL(e.[From],'') + '/' + ISNULL(e.[To],'') ) LIKE '%'+c.Cs3Emails+'%'
Forces a full table scan because it uses functions in the WHERE
clause and because of like '%....%' - the % sign in the begining. Can you rewrite that part to use fields/constants?
You should make your queries to benefit on the indexed fields - for example, if you have an index on field c.Cs3Emails
, your query should look like
WHERE c.Cs3Emails = 'some_value_here'
the some_value_here
value should not be the result of functions output (if this ispossible)
Even if your query looks like
WHERE c.Cs3Emails = 'some_value_here%'
An index can be used, but not when having
WHERE c.Cs3Emails = '%some_value_here%'
Because here, is like you would try to find in a phonebook after the second letter of the name.
Try to read more about database indexes and how to use them
Also, I think you can rewrite your query like:
SELECT
e.*
FROM Emails e
WHERE EXISTS
(
SELECT Cs3Emails
FROM Cs3EmailsForPurge c
WHERE e.From = c.Cs3Emails OR e.To = c.Cs3Emails
)
Please try this query and see if it returns the same results as yours
If my query returns the same result set and performance does not improve, add an index on field Emails.From
and an index on Emails.To
. This will improve your query execution time.
Upvotes: 2
Reputation: 11987
the idea behind this query is flawed. i understand that you are trying to chain the messages together using a tracking number of a sort, probably in the subject line, but the way this should work is that as emails are imported into the database their tracking number should be found and then extracted into its own field. with this you pay a small hit to each import, but receive massive benefits when querying your data.
i would suggest you update your current data and rewrite the import. that query is not going to get any more performant.
Upvotes: 2