Friday, July 11, 2008

MySQL: DIVide and Conquer

Everybody that has had to do some numeric calculations in SQL will have encountered this simple problem: divide an integer by another integer and round down the outcome to the nearest integer. In many cases people write it like this:

FLOOR(a/b)

Simple enough, right? First we do the division a/b, then we round down using the function FLOOR().


Update: My claim that TRUNCATE(a/b, 0) is equivalent to FLOOR(a/b) is false! It maybe true when the outcome of the division is a positive number, but in case of a negative number, TRUNCATE() will only lose the decimals (resulting in a higher negative number) and FLOOR() will still round down.

Thanks Kai!


However, there is a better way.

We can use the integer division operator DIV instead of the ordinary division operator /. Because this is an integer division, there is no need for an extra function to lose the decimals, and the expression is simply:

a DIV b

This approach has a number of advantages:

  • It is explicit. By looking at the expression we know immediately that the result will be an integer, and that a and b are meant to be integers too.

  • It is easier to read. Because we don't need another function and parenthesis, this expression is easier on the eyes, something that you will appreciate if the expression is not simply FLOOR(a/b) but something like FLOOR(SUM(a)/SUM(IFNULL(b,0)))

  • It is fast! The DIV operation does not have to deal with complex floating point math, and will be much faster on most microprocessors


To prove the last point, take a look at the results of a simple benchmark. I simply used the BENCHMARK() function and executed:

mysql> SELECT BENCHMARK(10000000,1234567 DIV 7) ;
+-----------------------------------+
| BENCHMARK(10000000,1234567 DIV 7) |
+-----------------------------------+
| 0 |
+-----------------------------------+
1 row in set (0.83 sec)

mysql> SELECT BENCHMARK(10000000,1234567 / 7) ;
+---------------------------------+
| BENCHMARK(10000000,1234567 / 7) |
+---------------------------------+
| 0 |
+---------------------------------+
1 row in set (7.26 sec)

mysql> SELECT BENCHMARK(10000000,FLOOR(1234567 / 7)) ;
+----------------------------------------+
| BENCHMARK(10000000,FLOOR(1234567 / 7)) |
+----------------------------------------+
| 0 |
+----------------------------------------+
1 row in set (8.80 sec)

I repeated this two more times and averaged the time spent, and then made this little graph of the results:
DIVideAndConquerThe results show that DIV is about 9 to 10 times faster than the ordinary division operator, and that adding FLOOR() function makes the entire expression another 10% slower.

Now, I don't think the performance benefit is of much practical significance. You may see a slight improvement for large datasets using multiple division operations, but in many cases the ordinary query processing will probably have a much larger part in the total time spent. But still, DIV is faster, easier to read and more explicit if you want to solve this type of problem.

4 comments:

Andrew Dashin said...

Hi,

Nice tip! Thank you.

Anonymous said...

Roland,

That's a fantastic tip! I've never come across that operator before.

I'm working on a very speed sensitive app that I've moved into memory tables to maximise speed.

I'm guessing that this will give me another boost.

Joe

rpbouman said...

Hi Andrew, Joe,

thanks!

Joe: Like I wrote in the article, I don't expect that much from it, so I wouldn't hold my breath if I were you. However if your application is such that it simply needs integer arithmetic, it certainly won't hurt to use DIV, and it may give you some gains.

It would be cool if you could benchmark your real-world case, and post back some figures. Of course, if you experience some speedup, it does not mean it will be generally applicable, but it would still be interesting to see how much or how little difference it makes for your particular case.

Thanks in advance,

Roland.

Anonymous said...

Great tip buddy.

Sam

DuckDB Bag of Tricks: Reading JSON, Data Type Detection, and Query Performance

DuckDB bag of tricks is the banner I use on this blog to post my tips and tricks about DuckDB . This post is about a particular challenge...