Peyman abdollahy
Peyman abdollahy

Reputation: 829

how have an optimize and good update statement in mysql?

I want to write a statement to update for example is_new to 0 with a condition like mail.user_id=$user.id etc.

update mails set `is_new=0` WHERE `mails.user_id=$user.id`

should i add another condition for prevent updating records that already have is_new equal to zero? like this

update mails set is_new=0 WHERE mails.user_id=$user.id and is_new!=0

which statment is the best? and do u think it's required to have second condition too?

Upvotes: 2

Views: 147

Answers (3)

N.B.
N.B.

Reputation: 14071

From my experience and from my opinion, the answer is clear here - do not add is_new != 0. There are several reasons and I'll try to list them:

  • SQL is about explaining what you want to do. Adding the condition would be also telling how to do it. MySQL is optimized enough to avoid working more than it should.

  • Adding the condition doesn't make your query more readable. Removing it does. It clearly states you want to update is_new to a value based on user_id

  • Adding the condition has the potential to hurt performance. Note the word potential - I'm not saying it will, but it might. If you can't optimize how something works, why would you potentially make it slower? Just don't. In your case, we're probably talking about abysmal performance values, but still.

  • Adding the condition will make MySQL do more work. Not much, but still - it will just do more. You don't really need it.

  • MySQL performs a check whether the row has indeed changed before writing anything new to it. If the is_new was 0 and you want to set it to 0 again - MySQL will notice that and it will skip updating that row (write operation is much more costly than a check operation, so MySQL will perform that check).

Quote from manual:

If you set a column to the value it currently has, MySQL notices this and does not update it.

Relevant source code, if needed: here

Upvotes: 1

Lajos Arpad
Lajos Arpad

Reputation: 76508

The thing is that when you set the value in your update, it is effectively overwriting the old value, which takes longer than checking whether the value should be overriden. So, you have the following cases:

Case1: You update every is_new to 0

  • if user_id matches and is_new is not 0, then it will correctly be updated to 0
  • if user_id matches and is_new is already 0, then it will update it to 0, but it will take a longer time, since it is a write operation

Case2: You set is_new to 0 if it is not already 0

  • if user_id matches and is_new is not 0, then it will check whether it is 0 and consequently and correctly update it to 0
  • if user_id matches and is_new is already 0, then it will check whether it is 0 and will leave the record as it is

Let's suppose you have to update x records , however, from the x records, only y have a not 0 is_new. Checking whether is_new is 0 takes a time and updating is_new to 0 takes b time for a record on average. In this case:

If you do not filter out the records where is_new is already 0, then: you will update all the records, which takes x * b time.

If you filter out the records where is_new is already 0, then: you will check is_new for the records, which takes x * a and then you will update y records, which takes y * b.

Intuitively, I believe that writing takes so much time, that:

x * a + y * b < x * b

However, the sure thing would be to benchmark it and see some measurements, because no one on earth will tell you how your mysql executes your query under the hood.

EDIT:

Based on comments, apparently, the documentation states that if a field is updated to its current value, then MySQL will notice the issue and will not execute the update.

If you set a column to the value it currently has, MySQL notices this and does not update it.

Source.

This does not change the fact that in case an optimization is needed, experiments should be done, since, if the quote is proven wrong, it would not be the first time a documentation is flawed. Also, the ideas should be useful for other cases as well, since the quote only touches conditions with the = operator and in case other conditions should be checked, the quoted MySQL feature would be inapplicable.

Upvotes: 1

Topher Hunt
Topher Hunt

Reputation: 4804

As a very general principle, I structure my UPDATE queries to only "touch" the records that actually need touching. In my mind this is less for performance reasons and more for readability reasons: the latter form of the query more precisely states what you're trying to do -- namely, you want to set is_new=0 for only those records where is_new is not already 0. So for that reason alone, I'd say the extra condition should be present.

On a more general note, there's a famous computer science quote that "Premature optimization is the root of all evil". When working with MySQL queries I don't always agree with this quote, but it makes an important point: If you're getting hung up about whether to optimize something, and you haven't actually tried out the two formats and/or taken measurements to determine how much difference the optimization makes, you're wasting your time. If you're optimizing without knowing what difference it will make, the only outcome you know for certain is that you're making your job more complicated.

So don't optimize for performance reasons until you know (or have strong reasons to believe) that the optimization will make a difference. But always optimize for readability and standardization reasons, since you do know that people will have to read your code in the future. At least that's my rule.

Upvotes: 1

Related Questions