Jump to content
MakeWebGames

Simple MySQL optimization


Recommended Posts

It's a small optimization, but something I see in a lot of noobish code...

Take for instance, a query like this:

 

UPDATE users
SET bankmoney = bankmoney + 10000
WHERE ((bankmoney >= 0) AND (fedjail <= 0) AND (last_login >= (UNIX_TIMESTAMP() - (3 * 86400))))

 

The MySQL manual states the following about optimization of the where clause:

Removal of unnecessary parentheses

A query like the one above could be written like this:

UPDATE users
SET bankmoney = bankmoney + 10000
WHERE bankmoney >= 0 AND fedjail <= 0 AND last_login >= UNIX_TIMESTAMP() - 3 * 86400

 

All those extra parenthesis don't actually do anything for this query. The one tricky part is at the end.

UNIX_TIMESTAMP() - 3 * 86400

Because MySQL follows mathematical order of operations 3 is multiplied by 86400 first, and then that value is subtracted from the current unix timestamp. Thus, even the parenthesis used to force execution order there are unnecessary.

 

I hope this helps.

Naturally, this is one of those deals that is akin to the difference of print and echo.

I wouldn't go through thousands of lines of code removing parenthesis, but I certainly wouldn't write new code that was bogged down with them either.

Happy MySQL'in folks!

And stay tuned for the next installment of the Horizons Workshop!

Link to comment
Share on other sites

Re: Simple MySQL optimization

Sure thing ;)

I've read many parts of the manuals more times than I care to remember. lol, I've even read many of the comments people post. There's some fantastic stuff in there for those that want to get some more optimization techniques straight from the developers. :)

Link to comment
Share on other sites

Guest Anonymous

Re: Simple MySQL optimization

10/10 -- Nice dig...

Whilst you may of course be correct semantically, optimizations at this level, especially given the query as presented, are in general considered useless.

Consider, if you will, the actual conditional part:

#1 bankmoney >= 0

#2 fedjail <= 0

#3 last_login >= UNIX_TIMESTAMP() - 3 * 86400

Now while each of these parts will limit the amount of rows searched, exactly how does this happen?

Without any indices, there is a complete row-scan, something that is not optimal.

With indicies on the relevant fields : bankmoney, fedjail, and last_login; you will have a *much* better optimization than basic parenthesis removal.

There's also the case to consider of calling the function UNIX_TIMESTAMP(). Passing in the actual (current) time - 3 days would result in a slight performance increase as the mathematic parsing routines of the SQL interpreter are not so heavily used.

And finally, whilst Floydian is roughly correct here, there is a case *for* using parenthesis as it can become easier to read, is certainly easier to maintain, and somewhat easier to test sub expressions.

At the end of the day -- relying on comments on optimization without understanding the underlying data-definition is risky. Judicious use of the EXPLAIN keyword (and perhaps a true query performance analyzer) will help you with *your* project.

Of course, I accept that a lot of people just don't have the experience to look at a query and a data definition schema in order to improve its level of optimization, but with time, they may learn. Reading the manual thoroughly is possibly the key here, but also listening to others helps.

Link to comment
Share on other sites

  • 5 months later...

Re: Simple MySQL optimization

 

10/10 -- Nice dig...

Whilst you may of course be correct semantically, optimizations at this level, especially given the query as presented, are in general considered useless.

Consider, if you will, the actual conditional part:

#1 bankmoney >= 0

#2 fedjail <= 0

#3 last_login >= UNIX_TIMESTAMP() - 3 * 86400

Now while each of these parts will limit the amount of rows searched, exactly how does this happen?

Without any indices, there is a complete row-scan, something that is not optimal.

With indicies on the relevant fields : bankmoney, fedjail, and last_login; you will have a *much* better optimization than basic parenthesis removal.

There's also the case to consider of calling the function UNIX_TIMESTAMP(). Passing in the actual (current) time - 3 days would result in a slight performance increase as the mathematic parsing routines of the SQL interpreter are not so heavily used.

And finally, whilst Floydian is roughly correct here, there is a case *for* using parenthesis as it can become easier to read, is certainly easier to maintain, and somewhat easier to test sub expressions.

At the end of the day -- relying on comments on optimization without understanding the underlying data-definition is risky. Judicious use of the EXPLAIN keyword (and perhaps a true query performance <censored>yzer) will help you with *your* project.

Of course, I accept that a lot of people just don't have the experience to look at a query and a data definition schema in order to improve its level of optimization, but with time, they may learn. Reading the manual thoroughly is possibly the key here, but also listening to others helps.

Thanx Nyna and thanks Floydian i learn alot from just reading all your posts ;)

Link to comment
Share on other sites

Re: Simple MySQL optimization

 

It's a small optimization, but something I see in a lot of noobish code...

I'd imagine you see it there because it's also in the high-end, professionally-authored code available in books and tutorials they've either learned or cut & paste from. The reason it's done is for readability, which is much more important overall than trying to optimize the already extremely tight order of operation handling mechanisms in mysql.

If you're really wanting to optimize that particular process, there are some very real things you should be doing:

First and foremost, the engine; is this table insert and read only, or are there updates? If there are updates, you should be creating the table using the InnoDB engine, which allows for row-level locking.

Next, how many records are in the table? More than a couple thousand and your performance goes to hell unless you've got indexes in place, and THAT is where optimizations of the query, and the query plan really begin to matter.

Beyond this, you get to the real meat of the issue, which is I/O speed (getting data to and from the disk), which is the real nemesis of DBAs everywhere.

Link to comment
Share on other sites

Guest Anonymous

Re: Simple MySQL optimization

Currently, we have one project that is peaking at around 70 million hits per hour, which is effectively ~150 million queries per hour. Now obviously, this is no baby system, but it actually runs on myisam tables cleanly.

Unfortunately, there are some minor issues with it, and I am aiming to switch to innodb and using a master slave system (ideally, I'd use clustering, but I'm not sure the client will fork out for the 12 machines that produces the best availability and stability in a single data center).

Scalability has always been at the back of my mind, and this particular client has given me an insight into data center levels of traffic while using off-the-shelf components.

The code for this by the way, is plain old-fashioned PHP4 (or rather it was until I got my grubby mitts on it), but it's not that overly complex to handle that type of load with a little thinking.

As for rows, well, we generally keep around 6 months worth of data in the live system and archive the remainder out on a regular basis, without any real issues.

The trick, (I believe) is in understanding the underlying file-system, caching, and correct query optimization - most of which can be gleaned from RTFM.

Link to comment
Share on other sites

Re: Simple MySQL optimization

 

The trick, (I believe) is in understanding the underlying file-system, caching, and correct query optimization - most of which can be gleaned from RTFM.

When it comes to I/O issues, that's always the case. If your queries start slowing down during peak load, and you've done what you can optimizing indexes, queries, and query plans, then your next plan of attack is anything to speed up I/O.

We found that during our end-game routines, our player resource table was getting crushed due to disk speed - our first attempts were in the form of query optimizations, and then indexes. When these were no longer adequate, we moved on to separate spindles on faster disks (raid-1 mirrors on 15k SAS - one for logs, another for data - as is common practice). After these were shown to be too slow, we went to mounting our actual mysql file structure in memory via TEMPFS.

Now, our db keeps up with whatever we toss at it, but we're currently in process of planning future upgrades in which we will be moving to clustered situation in which there are multiple read and read/write members, each with a specific partition of the database (ie required replica tables for particular activies), all running directly out of memory.

Link to comment
Share on other sites

Re: Simple MySQL optimization

 

10/10 -- Nice dig...

This thread dates to August, not quite sure why it was resurrected. Anyways, Nyna and I were feuding, hence the nice dig comment.

That's pretty much all this was about. Clearly removing parens and relying strictly on order of operations is nothing to worry about.

 

However, I would make the argument that many times more parens actually make code less readable...

 

If you think ----

WHERE ((bankmoney >= 0) AND (fedjail <= 0) AND (last_login >= (UNIX_TIMESTAMP() - (3 * 86400))))

is more readable than ----

WHERE bankmoney >= 0 AND fedjail <= 0 AND last_login >= UNIX_TIMESTAMP() - 3 * 86400

Then there is nothing I can do for you.

There's only one set of parens I could see adding in, and only because one might want to clear up any order of operation confusion in noobs.

 

WHERE bankmoney >= 0 AND fedjail <= 0 AND last_login >= UNIX_TIMESTAMP() - (3 * 86400)

That IMHO is no more readable than the above example but is more dummy proof.

Link to comment
Share on other sites

Re: Simple MySQL optimization

 

That's pretty much all this was about. Clearly removing parens and relying strictly on order of operations is nothing to worry about.

The problem is when it comes to overriding the order of operations purposely, which your advice could cause other people to subvert unintentionally in trying to "optimize" their code (especially given the audience you're sending this to). For example:

WHERE ((bankmoney >= 0) OR (fedjail <= 0)) AND (last_login >= (UNIX_TIMESTAMP() - (3 * 86400))))

would return is entirely different (and drastically smaller) set of records than

WHERE bankmoney >= 0 OR fedjail <= 0 AND last_login >= UNIX_TIMESTAMP() - 3 * 86400

 

There's only one set of parens I could see adding in, and only because one might want to clear up any order of operation confusion in noobs.

That's the point though, people remember that parenthesis mean "do this first" if they remember nothing else from high-school math class, and if that makes it easier to understand, then that extra tiny bit of work mysql does is worth the legibility to both yourself and any other coders who may follow in your footsteps in spades.

But then again, my game is only on the top of this site's top game list, so your mileage may vary.

Link to comment
Share on other sites

Re: Simple MySQL optimization

So does this count for ` ?

e.g (Sorry for it being an extract out of index.php)

if(isset($_POST['NotesUpdate']))
{
$UpdatePad = sprintf(
"UPDATE `users` SET
	`user_notepad` = '%s'
WHERE `userid` = '%u'",
mysql_real_escape_string($_POST['NotesUpdate']),
$userid);
$db->query($UpdatePad);	
@header('Redirect: 1; index.php');
}
Link to comment
Share on other sites

Re: Simple MySQL optimization

Well...

if(isset($_POST['NotesUpdate']))
{
$UpdatePad = sprintf(
"UPDATE `users` SET
	`user_notepad` = '%s'
WHERE `userid` = '%u'",
mysql_real_escape_string($_POST['NotesUpdate']),
$userid);
$db->query($UpdatePad);	
@header('Redirect: 1; index.php');
}

 

would work the same without ` like:

 

if(isset($_POST['NotesUpdate']))
{
$UpdatePad = sprintf(
"UPDATE users SET
	user_notepad = '%s'
WHERE userid = '%u'",
mysql_real_escape_string($_POST['NotesUpdate']),
$userid);
$db->query($UpdatePad);	
@header('Redirect: 1; index.php');
}

 

So would this affect it or not?

Link to comment
Share on other sites

Re: Simple MySQL optimization

 

It's so much easier to understand without unnecessary brackets.

In his example or mine? If mine, you've helped illustrate my point, as the brackets weren't unnecessary, as the bracketed query would return FAR fewer results due to the execution plan.

Link to comment
Share on other sites

Re: Simple MySQL optimization

WHERE ((bankmoney >= 0) AND (fedjail <= 0) AND (last_login >= (UNIX_TIMESTAMP() - (3 * 86400))))

I was on about that. To take all the brackets in this case wouldn't be a good idea, because of how it would be read. If you think about "BODMAS" the outcomes with or without the brackets would be different.

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...