Floydian Posted August 3, 2008 Posted August 3, 2008 It's a small optimization, but something I see in a lot of noobish code... Take for instance, a query like this: UPDATE users SET bankmoney = bankmoney + 10000 WHERE ((bankmoney >= 0) AND (fedjail <= 0) AND (last_login >= (UNIX_TIMESTAMP() - (3 * 86400)))) The MySQL manual states the following about optimization of the where clause: Removal of unnecessary parentheses A query like the one above could be written like this: UPDATE users SET bankmoney = bankmoney + 10000 WHERE bankmoney >= 0 AND fedjail <= 0 AND last_login >= UNIX_TIMESTAMP() - 3 * 86400 All those extra parenthesis don't actually do anything for this query. The one tricky part is at the end. UNIX_TIMESTAMP() - 3 * 86400 Because MySQL follows mathematical order of operations 3 is multiplied by 86400 first, and then that value is subtracted from the current unix timestamp. Thus, even the parenthesis used to force execution order there are unnecessary. I hope this helps. Naturally, this is one of those deals that is akin to the difference of print and echo. I wouldn't go through thousands of lines of code removing parenthesis, but I certainly wouldn't write new code that was bogged down with them either. Happy MySQL'in folks! And stay tuned for the next installment of the Horizons Workshop! Quote
Isomerizer Posted August 3, 2008 Posted August 3, 2008 Re: Simple MySQL optimization Thanks, never knew it made a difference, even if it is a small one. Quote
Floydian Posted August 3, 2008 Author Posted August 3, 2008 Re: Simple MySQL optimization Sure thing ;) I've read many parts of the manuals more times than I care to remember. lol, I've even read many of the comments people post. There's some fantastic stuff in there for those that want to get some more optimization techniques straight from the developers. :) Quote
Guest Anonymous Posted August 3, 2008 Posted August 3, 2008 Re: Simple MySQL optimization 10/10 -- Nice dig... Whilst you may of course be correct semantically, optimizations at this level, especially given the query as presented, are in general considered useless. Consider, if you will, the actual conditional part: #1 bankmoney >= 0 #2 fedjail <= 0 #3 last_login >= UNIX_TIMESTAMP() - 3 * 86400 Now while each of these parts will limit the amount of rows searched, exactly how does this happen? Without any indices, there is a complete row-scan, something that is not optimal. With indicies on the relevant fields : bankmoney, fedjail, and last_login; you will have a *much* better optimization than basic parenthesis removal. There's also the case to consider of calling the function UNIX_TIMESTAMP(). Passing in the actual (current) time - 3 days would result in a slight performance increase as the mathematic parsing routines of the SQL interpreter are not so heavily used. And finally, whilst Floydian is roughly correct here, there is a case *for* using parenthesis as it can become easier to read, is certainly easier to maintain, and somewhat easier to test sub expressions. At the end of the day -- relying on comments on optimization without understanding the underlying data-definition is risky. Judicious use of the EXPLAIN keyword (and perhaps a true query performance analyzer) will help you with *your* project. Of course, I accept that a lot of people just don't have the experience to look at a query and a data definition schema in order to improve its level of optimization, but with time, they may learn. Reading the manual thoroughly is possibly the key here, but also listening to others helps. Quote
Saha Posted January 27, 2009 Posted January 27, 2009 Re: Simple MySQL optimization 10/10 -- Nice dig... Whilst you may of course be correct semantically, optimizations at this level, especially given the query as presented, are in general considered useless. Consider, if you will, the actual conditional part: #1 bankmoney >= 0 #2 fedjail <= 0 #3 last_login >= UNIX_TIMESTAMP() - 3 * 86400 Now while each of these parts will limit the amount of rows searched, exactly how does this happen? Without any indices, there is a complete row-scan, something that is not optimal. With indicies on the relevant fields : bankmoney, fedjail, and last_login; you will have a *much* better optimization than basic parenthesis removal. There's also the case to consider of calling the function UNIX_TIMESTAMP(). Passing in the actual (current) time - 3 days would result in a slight performance increase as the mathematic parsing routines of the SQL interpreter are not so heavily used. And finally, whilst Floydian is roughly correct here, there is a case *for* using parenthesis as it can become easier to read, is certainly easier to maintain, and somewhat easier to test sub expressions. At the end of the day -- relying on comments on optimization without understanding the underlying data-definition is risky. Judicious use of the EXPLAIN keyword (and perhaps a true query performance <censored>yzer) will help you with *your* project. Of course, I accept that a lot of people just don't have the experience to look at a query and a data definition schema in order to improve its level of optimization, but with time, they may learn. Reading the manual thoroughly is possibly the key here, but also listening to others helps. Thanx Nyna and thanks Floydian i learn alot from just reading all your posts ;) Quote
CtrlFreq Posted January 29, 2009 Posted January 29, 2009 Re: Simple MySQL optimization It's a small optimization, but something I see in a lot of noobish code... I'd imagine you see it there because it's also in the high-end, professionally-authored code available in books and tutorials they've either learned or cut & paste from. The reason it's done is for readability, which is much more important overall than trying to optimize the already extremely tight order of operation handling mechanisms in mysql. If you're really wanting to optimize that particular process, there are some very real things you should be doing: First and foremost, the engine; is this table insert and read only, or are there updates? If there are updates, you should be creating the table using the InnoDB engine, which allows for row-level locking. Next, how many records are in the table? More than a couple thousand and your performance goes to hell unless you've got indexes in place, and THAT is where optimizations of the query, and the query plan really begin to matter. Beyond this, you get to the real meat of the issue, which is I/O speed (getting data to and from the disk), which is the real nemesis of DBAs everywhere. Quote
Guest Anonymous Posted January 29, 2009 Posted January 29, 2009 Re: Simple MySQL optimization Currently, we have one project that is peaking at around 70 million hits per hour, which is effectively ~150 million queries per hour. Now obviously, this is no baby system, but it actually runs on myisam tables cleanly. Unfortunately, there are some minor issues with it, and I am aiming to switch to innodb and using a master slave system (ideally, I'd use clustering, but I'm not sure the client will fork out for the 12 machines that produces the best availability and stability in a single data center). Scalability has always been at the back of my mind, and this particular client has given me an insight into data center levels of traffic while using off-the-shelf components. The code for this by the way, is plain old-fashioned PHP4 (or rather it was until I got my grubby mitts on it), but it's not that overly complex to handle that type of load with a little thinking. As for rows, well, we generally keep around 6 months worth of data in the live system and archive the remainder out on a regular basis, without any real issues. The trick, (I believe) is in understanding the underlying file-system, caching, and correct query optimization - most of which can be gleaned from RTFM. Quote
CtrlFreq Posted January 29, 2009 Posted January 29, 2009 Re: Simple MySQL optimization The trick, (I believe) is in understanding the underlying file-system, caching, and correct query optimization - most of which can be gleaned from RTFM. When it comes to I/O issues, that's always the case. If your queries start slowing down during peak load, and you've done what you can optimizing indexes, queries, and query plans, then your next plan of attack is anything to speed up I/O. We found that during our end-game routines, our player resource table was getting crushed due to disk speed - our first attempts were in the form of query optimizations, and then indexes. When these were no longer adequate, we moved on to separate spindles on faster disks (raid-1 mirrors on 15k SAS - one for logs, another for data - as is common practice). After these were shown to be too slow, we went to mounting our actual mysql file structure in memory via TEMPFS. Now, our db keeps up with whatever we toss at it, but we're currently in process of planning future upgrades in which we will be moving to clustered situation in which there are multiple read and read/write members, each with a specific partition of the database (ie required replica tables for particular activies), all running directly out of memory. Quote
Floydian Posted January 29, 2009 Author Posted January 29, 2009 Re: Simple MySQL optimization 10/10 -- Nice dig... This thread dates to August, not quite sure why it was resurrected. Anyways, Nyna and I were feuding, hence the nice dig comment. That's pretty much all this was about. Clearly removing parens and relying strictly on order of operations is nothing to worry about. However, I would make the argument that many times more parens actually make code less readable... If you think ---- WHERE ((bankmoney >= 0) AND (fedjail <= 0) AND (last_login >= (UNIX_TIMESTAMP() - (3 * 86400)))) is more readable than ---- WHERE bankmoney >= 0 AND fedjail <= 0 AND last_login >= UNIX_TIMESTAMP() - 3 * 86400 Then there is nothing I can do for you. There's only one set of parens I could see adding in, and only because one might want to clear up any order of operation confusion in noobs. WHERE bankmoney >= 0 AND fedjail <= 0 AND last_login >= UNIX_TIMESTAMP() - (3 * 86400) That IMHO is no more readable than the above example but is more dummy proof. Quote
CtrlFreq Posted January 29, 2009 Posted January 29, 2009 Re: Simple MySQL optimization That's pretty much all this was about. Clearly removing parens and relying strictly on order of operations is nothing to worry about. The problem is when it comes to overriding the order of operations purposely, which your advice could cause other people to subvert unintentionally in trying to "optimize" their code (especially given the audience you're sending this to). For example: WHERE ((bankmoney >= 0) OR (fedjail <= 0)) AND (last_login >= (UNIX_TIMESTAMP() - (3 * 86400)))) would return is entirely different (and drastically smaller) set of records than WHERE bankmoney >= 0 OR fedjail <= 0 AND last_login >= UNIX_TIMESTAMP() - 3 * 86400 There's only one set of parens I could see adding in, and only because one might want to clear up any order of operation confusion in noobs. That's the point though, people remember that parenthesis mean "do this first" if they remember nothing else from high-school math class, and if that makes it easier to understand, then that extra tiny bit of work mysql does is worth the legibility to both yourself and any other coders who may follow in your footsteps in spades. But then again, my game is only on the top of this site's top game list, so your mileage may vary. Quote
POG1 Posted January 29, 2009 Posted January 29, 2009 Re: Simple MySQL optimization It's so much easier to understand without unnecessary brackets. Quote
Karlos Posted January 29, 2009 Posted January 29, 2009 Re: Simple MySQL optimization So does this count for ` ? e.g (Sorry for it being an extract out of index.php) if(isset($_POST['NotesUpdate'])) { $UpdatePad = sprintf( "UPDATE `users` SET `user_notepad` = '%s' WHERE `userid` = '%u'", mysql_real_escape_string($_POST['NotesUpdate']), $userid); $db->query($UpdatePad); @header('Redirect: 1; index.php'); } Quote
POG1 Posted January 29, 2009 Posted January 29, 2009 Re: Simple MySQL optimization That is used for names (i think). phpmyadmin uses it with the table names so i guess it is required. Quote
Karlos Posted January 29, 2009 Posted January 29, 2009 Re: Simple MySQL optimization Well... if(isset($_POST['NotesUpdate'])) { $UpdatePad = sprintf( "UPDATE `users` SET `user_notepad` = '%s' WHERE `userid` = '%u'", mysql_real_escape_string($_POST['NotesUpdate']), $userid); $db->query($UpdatePad); @header('Redirect: 1; index.php'); } would work the same without ` like: if(isset($_POST['NotesUpdate'])) { $UpdatePad = sprintf( "UPDATE users SET user_notepad = '%s' WHERE userid = '%u'", mysql_real_escape_string($_POST['NotesUpdate']), $userid); $db->query($UpdatePad); @header('Redirect: 1; index.php'); } So would this affect it or not? Quote
Haunted Dawg Posted January 29, 2009 Posted January 29, 2009 Re: Simple MySQL optimization Not really. Why? Because it specify's which field and which table is must update. Quote
Karlos Posted January 29, 2009 Posted January 29, 2009 Re: Simple MySQL optimization Just curiosity really. Just an idea I never though of asking but this seemed like the right thread so I thought I'll have ago Quote
CtrlFreq Posted January 29, 2009 Posted January 29, 2009 Re: Simple MySQL optimization It's so much easier to understand without unnecessary brackets. In his example or mine? If mine, you've helped illustrate my point, as the brackets weren't unnecessary, as the bracketed query would return FAR fewer results due to the execution plan. Quote
POG1 Posted January 29, 2009 Posted January 29, 2009 Re: Simple MySQL optimization WHERE ((bankmoney >= 0) AND (fedjail <= 0) AND (last_login >= (UNIX_TIMESTAMP() - (3 * 86400)))) I was on about that. To take all the brackets in this case wouldn't be a good idea, because of how it would be read. If you think about "BODMAS" the outcomes with or without the brackets would be different. Quote
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.