Reputation: 829
I want to write a statement to update for example is_new
to 0 with a condition like mail.user_id=$user.id
etc.
update mails set `is_new=0` WHERE `mails.user_id=$user.id`
should i add another condition for prevent updating records that already have is_new
equal to zero? like this
update mails set is_new=0 WHERE mails.user_id=$user.id and is_new!=0
which statment is the best? and do u think it's required to have second condition too?
Upvotes: 2
Views: 147
Reputation: 14071
From my experience and from my opinion, the answer is clear here - do not add is_new != 0
. There are several reasons and I'll try to list them:
SQL is about explaining what you want to do. Adding the condition would be also telling how to do it. MySQL is optimized enough to avoid working more than it should.
Adding the condition doesn't make your query more readable. Removing it does. It clearly states you want to update is_new
to a value based on user_id
Adding the condition has the potential to hurt performance. Note the word potential - I'm not saying it will, but it might. If you can't optimize how something works, why would you potentially make it slower? Just don't. In your case, we're probably talking about abysmal performance values, but still.
Adding the condition will make MySQL do more work. Not much, but still - it will just do more. You don't really need it.
MySQL performs a check whether the row has indeed changed before writing anything new to it. If the is_new
was 0
and you want to set it to 0
again - MySQL will notice that and it will skip updating that row (write operation is much more costly than a check operation, so MySQL will perform that check).
Quote from manual:
If you set a column to the value it currently has, MySQL notices this and does not update it.
Relevant source code, if needed: here
Upvotes: 1
Reputation: 76508
The thing is that when you set the value in your update, it is effectively overwriting the old value, which takes longer than checking whether the value should be overriden. So, you have the following cases:
Case1: You update
every is_new
to 0
user_id
matches and is_new
is not 0
, then it will correctly be updated to 0
user_id
matches and is_new
is already 0
, then it will update
it to 0
, but it will take a longer time, since it is a write operationCase2: You set
is_new
to 0
if it is not already 0
user_id
matches and is_new
is not 0
, then it will check whether it is 0
and consequently and correctly update it to 0
user_id
matches and is_new
is already 0
, then it will check whether it is 0
and will leave the record as it isLet's suppose you have to update
x
records , however, from the x
records, only y
have a not 0
is_new. Checking whether is_new
is 0
takes a
time and updating is_new
to 0
takes b
time for a record on average. In this case:
If you do not filter out the records where is_new
is already 0
, then:
you will update
all the records, which takes x * b
time.
If you filter out the records where is_new
is already 0
, then:
you will check is_new
for the records, which takes x * a
and then you will update y
records, which takes y * b
.
Intuitively, I believe that writing takes so much time, that:
x * a + y * b < x * b
However, the sure thing would be to benchmark it and see some measurements, because no one on earth will tell you how your mysql executes your query under the hood.
EDIT:
Based on comments, apparently, the documentation states that if a field is updated to its current value, then MySQL will notice the issue and will not execute the update.
If you set a column to the value it currently has, MySQL notices this and does not update it.
This does not change the fact that in case an optimization is needed, experiments should be done, since, if the quote is proven wrong, it would not be the first time a documentation is flawed. Also, the ideas should be useful for other cases as well, since the quote only touches conditions with the = operator and in case other conditions should be checked, the quoted MySQL feature would be inapplicable.
Upvotes: 1
Reputation: 4804
As a very general principle, I structure my UPDATE queries to only "touch" the records that actually need touching. In my mind this is less for performance reasons and more for readability reasons: the latter form of the query more precisely states what you're trying to do -- namely, you want to set is_new=0
for only those records where is_new
is not already 0. So for that reason alone, I'd say the extra condition should be present.
On a more general note, there's a famous computer science quote that "Premature optimization is the root of all evil". When working with MySQL queries I don't always agree with this quote, but it makes an important point: If you're getting hung up about whether to optimize something, and you haven't actually tried out the two formats and/or taken measurements to determine how much difference the optimization makes, you're wasting your time. If you're optimizing without knowing what difference it will make, the only outcome you know for certain is that you're making your job more complicated.
So don't optimize for performance reasons until you know (or have strong reasons to believe) that the optimization will make a difference. But always optimize for readability and standardization reasons, since you do know that people will have to read your code in the future. At least that's my rule.
Upvotes: 1