Thursday, March 26, 2009

More fun visualizing MySQL Database Size

Hi again!

As a sidekick for my previous post, I came up with a snippet of code that generates the Google Chart URL to visualize table size for the current database. For example, for the sakila sample database, we get URL's like this:
http://chart.apis.google.com/chart?cht=bhs&chbh=19,2,2&chs=653x459&chtt=sakila%20Size%20(MB)&chco=4D89F9,C6D9FD&chd=t:0.0156,0.0156,0.0156,0.0156,0.0156,0.0469,0.0625,0.0625,0.0781,0.0781,0.1875,0.1875,0.1875,0.1163,0.1719,1.5156,1.5156|0.0000,0.0000,0.0000,0.0156,0.0313,0.0156,0.0156,0.0313,0.0156,0.0469,0.0781,0.0781,0.0781,0.2002,0.1875,0.6094,1.2031&chds=0,2.7188&chxt=y,x&chxl=0:|rental%20(InnoDB)|payment%20(InnoDB)|inventory%20(InnoDB)|film_text%20(MyISAM)|films%20(InnoDB)|film_actor%20(InnoDB)|film%20(InnoDB)|customer%20(InnoDB)|staff%20(InnoDB)|address%20(InnoDB)|film_category%20(InnoDB)|city%20(InnoDB)|store%20(InnoDB)|actor%20(InnoDB)|language%20(InnoDB)|country%20(InnoDB)|category%20(InnoDB)|1:|0|2.72MB&chm=N*f2*,000000,0,-1,11|N*f2*,000000,1,-1,11
The graph looks like this: Google Chart: Sakila database sizeHere's the script I used:
SET @maxpixels:=300000;
SET @barwidth:=19;
SET @spacebetweenbars:=2;
SET @spacebetweengroups:=2;
SET @totalbarwidth:=@barwidth+(2*@spacebetweenbars)+(2*@spacebetweengroups);
SET @megabytes:=1024*1024;
SET @decimals:=2;

SELECT CONCAT(
'http://chart.apis.google.com/chart'
, '?cht=bhs'
, '&chbh=',@barwidth,',',@spacebetweenbars,',',@spacebetweengroups
, '&chs=', @maxpixels div (COUNT(*) * @totalbarwidth),'x', COUNT(*) * @totalbarwidth
, '&chtt=', table_schema, ' Size (MB)'
, '&chco=4D89F9,C6D9FD'
, '&chd=t:', GROUP_CONCAT(data_length / @megabytes ORDER BY (data_length+index_length))
, '|', GROUP_CONCAT(index_length / @megabytes ORDER BY (data_length+index_length))
, '&chds=' ,0, ',', MAX(data_length+index_length)/@megabytes
, '&chxt=y,x'
, '&chxl=0:|', GROUP_CONCAT(table_name, ' (', engine,')' ORDER BY (data_length + index_length) DESC separator '|')
, '|1:|', 0, '|', ROUND(MAX(data_length+index_length) / @megabytes, @decimals), 'MB'
, '&chm=N*f',@decimals,'*,000000,0,-1,11|N*f',@decimals,'*,000000,1,-1,11'
)
FROM information_schema.tables
WHERE table_schema = SCHEMA()
AND table_type = 'BASE TABLE'
GROUP BY table_schema
I'm not really satisfied yet...I keep hitting limitations w/re to google charts. I built a little bit of logic that will ensure the resulting picture is within the upper limit of 300000 pixels. I just found out there is another limitation that says the chart height can't exceed 1000 pixels. I'm going to stop looking at this for a while, and maybe later on I will come up with some works-most-of-the-time kind of logic to control this.

Feel free to drop me a line if you have some ideas regarding this.

20 comments:

Ronald Bradford said...

Not bad. The limitations are a little annoying.
I also would like to see multi point line charts.

@insan3 said...

nice piece of sql Roland, i can see a 'GET' maximum LENGTH limitation.

@insan3 said...

nice piece of sql Rolland , neverthless the max size of the querystring could be a problem :( .

thanks.

rpbouman said...

Hi Ronald, insane! Thanks for posting a comment ;)

Yeah, I agree with both of you. I wrote in my previous post that I would never consider google charts for a serious db monitoring solution because of the obvious limitations.

Personally, I think Google Charts has a great feature ahead of it for showing online poll results - that kind of thing. For more serious charting there are already good solutions available, but I do think the learning curve for those is a bit steeper.

Anonymous said...

Hi Roland,

Overall URL length can easily grow to thousands of characters for common databases. I wonder how easy it would be to "partition" the graph into several images, which can then be combined to form the desired image.
I'll give it a thought, though obviously this is a big patch for a very simple problem (which Google can solve by allowing POST method, which they don't).

rpbouman said...

Hi Shlomi!

partitioning is a nice idea, I think that would be possible with some effort.

POST would not work, I mean, it would get you the image, but I think the <img> tag and thinks like CSS background-image all use GET anyway. So POST would only allow you to download the image, you would still need to hostit somewhere and i think that at that point you are better of using one of the many pother

Shlomi Noach said...

Hi Roland,

Changing from textual format to simple format may significantly reduce URL length. Instead of:
&chd=t:0,27,26,25,60,61
it's possible to write:
&chd=s:AaZY89

See:
http://code.google.com/apis/chart/formats.html#simple

Shlomi

rpbouman said...

Shlomi, thanks. I did notice this fomat, but other silly limitations (number of pixels) just made me stop looking into it further.

Shlomi Noach said...

Update: Google charts now supports POST!
http://code.google.com/apis/chart/docs/post_requests.html

Feb. 2010

rpbouman said...

Thanks Shlomi,

yeah I just noticed that the other day. Still, there are so many js visualization libs around, and most of them don't require you to post your data to a server at all. I'm going to go with those.

Shlomi Noach said...

Indeed, there are many JS solution. Do you have any recommendation?

rpbouman said...

Shlomi, flot is pretty good. Depends on jQuery, and if you need IE support, you need canvas emulation too (i can dig up where to find that but I am sure its in the flot documentation too)

Shlomi Noach said...

I'm looking into a more lightweight solution: http://danvk.org/dygraphs

These only provide time-series charts, which is all I need. The good part is that it's a single 45KB include which can easily be embedded in any HTML page.
flot requires a lot of dependencies, if I'm not mistaken.

Both are BSD/MIT, which is very good.

Thanks

rpbouman said...

That's a nice one too. Thanks!

As for lightweight and dependencies: I guess it depends on how you look at it. This dygraphs lib is 46kb, which is pretty good for what it provides. But flot.pack.js is 32 kb, a bit smaller still.

Granted, flot depends on jQuery, so we should add jQuery's weight to get an absolute weight. However, popular wisdom dictates that you should not host your own jQuery, instead, include one from google (see http://code.google.com/apis/ajaxlibs/documentation/#jquery). The idea is that more and more apps use this, ensuring it's already in the browser cache. (this is assuming that download time is the largest contributor to page slowness)

Anyway - I don't think it is going to matter a lot either way, but this sure looks like a worthwile alternative to flot, so thanks for pointing it out.

rpbouman said...

Oh, I should add - both flot and dygraphs both requre excanvas in case you care for IE compatibility.

Shlomi Noach said...

Thanks.
I'm very interested in a self contained solution.
Including jquery from some common web server (like google) is not a problem, really.

Both solutions offer far more ease of use than using Google Charts, which I'm doing now.

Shlomi

rpbouman said...

Shlomi, well if only need the time charts, and you want it to be self contained, I'd go for dycharts too. Thanks again for the tip.

Shlomi Noach said...

Still reviving this post ;)

Take a look at this
sample page, which uses my own openark-charts; they make for a very low footprint (15K), and can read basic Google-API urls.

rpbouman said...

Hi Shlomi!

yeah this looks really nice, good job!

Looks like you're using your own graph solution now? It's really impressive you got it down to 15k, good job :) Can you do Pie charts too, or is it mostly time-series based? (not that that is a problem, just curious...)

Shlomi Noach said...

Well, I got it down to 15K because it's not a general purpose solution. I've only coded what features I need for mycheckpoint.
BTW, the code could have been much smaller if I didn't insist of long, readable names for functions and class variables, which don't get minified.

I do not have pie charts (yet?), but have scatter plots:
http://code.openark.org/forge/wp-content/uploads/2010/09/r190/mcp_sql00/sv_report_html_24_7.html

But these are actually two different pieces of code.


(PS I think a comment I've posted a couple of weeks ago has gone lost)

DuckDB Bag of Tricks: Reading JSON, Data Type Detection, and Query Performance

DuckDB bag of tricks is the banner I use on this blog to post my tips and tricks about DuckDB . This post is about a particular challenge...