Posted by mkarg
on July 3, 2010 at 6:19 AM PDT
Finally iAnywhere answered my prayers and added support for JDBC batch mode to their latest (and fastest) driver, but with surprisingly bad performance. Let's see why.
In my opinion, SQL Anywhere is the best RDBMS I can think of. I remember when we started distributing it in Germany back in the early 1990s, as one of the first adopters in this country. Since then, we have provided it to hundreds of enterprises, from single-person, laptop-only businesses to large ones running replicated installations across country borders. So call me biased, but never have I wished to move to a different one, even now, after decades. It's like a Volkswagen: it runs and runs and runs, and needs virtually no administration. I actually love it.
But what I do not love is its performance. There are some things I wish would simply run faster. But since a zero-administration system obviously cannot deliver optimum performance on its own, one has to take care to write performance-friendly code. There is much room in the JDBC API to squander performance, or to win some. One thing that gives a performance boost in some situations is batch processing: packing several commands into one large packet and having all of them processed in a single step. This reduces the number of LAN round trips, which keeps the influence of LAN latency on the transaction's execution time low (and leaves the freed time slices on the LAN to others). It also speeds up your own transaction, because rows are more likely to stay in the cache instead of being swapped out in the meantime. And it shortens the time during which database locks block others, which in turn speeds up their transactions and thus is "more social behaviour".
So batch processing is key to improving performance, and the SQL Anywhere server product has offered this feature for a very long time. But not the JDBC driver. While JDBC actually has methods for building and executing batches (namely Statement.addBatch(String), Statement.executeBatch() and Statement.clearBatch()), SQL Anywhere's very own JDBC driver did not implement those methods so far. So if one wanted to use batches, there was no other way than to glue the SQL commands together into one SQL string and pass it to the driver. Not very smart, and anything but portable, as the string had to include BEGIN and END statements to get parsed correctly.
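To make the two approaches concrete, here is a minimal sketch. The class BatchVersusCompound, its method names, and the table in main are made up for illustration; only Statement.addBatch(String), Statement.executeBatch() and Statement.execute(String) are standard JDBC calls. The glue helper assumes that the single commands carry no trailing semicolon.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class contrasting the two approaches described above.
class BatchVersusCompound {

    // Portable variant: hand each command to the driver and let it build the batch.
    static int[] runAsBatch(Connection connection, List<String> commands) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            for (String command : commands) {
                statement.addBatch(command);
            }
            return statement.executeBatch(); // one update count per command
        }
    }

    // Non-portable variant: glue the commands into one compound statement.
    // Assumes the commands carry no trailing semicolon.
    static String glue(List<String> commands) {
        StringBuilder compoundSql = new StringBuilder("BEGIN ");
        for (String command : commands) {
            compoundSql.append(command).append("; ");
        }
        return compoundSql.append("END;").toString();
    }

    // Sends the glued string as one statement; no per-command results are collected.
    static void runAsCompound(Connection connection, List<String> commands) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(glue(commands));
        }
    }

    public static void main(String[] args) {
        // Table and column names are invented for this example; no database is needed here.
        List<String> commands = Arrays.asList(
                "UPDATE t SET a = 1 WHERE id = 1",
                "UPDATE t SET a = 2 WHERE id = 2");
        System.out.println(glue(commands));
    }
}
```

The batch variant stays within the JDBC API, while the glued variant depends on the server accepting a BEGIN ... END block, which is exactly the portability problem mentioned above.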
Then came the day when I decided that enough is enough and asked Sybase to implement batch support in the driver. After a surprisingly short discussion they actually added it and published it "silently" as part of the latest patch (EBF - Express Bug Fix). So when you download the latest EBF for SQL Anywhere 11.0.1, you will get it without further notice (just as you got the new driver with maintenance update 11.0.1 without any note in the manual, which still does not say a single word about this fast driver). I was happy and gave it a test, and was very dissatisfied with the resulting performance.
So here comes the good news first. Yes, the EBF enables JDBC batch mode in the SQL Anywhere 11.0.1 driver (remember, we're not talking about the iAnywhere or the jConnect drivers: the former is outdated and thus not covered by my test, the latter already had batch mode support on board but is just plain slow). One can use the aforementioned methods and they will just work as expected. The road is clear for readable and portable code: no more need to glue strings together. No more? Wait!
The bad news is still to come. If you're a performance junkie like me, and performance is typically why you want to use batch mode in the first place, you will still have to live with the non-portable, dirty string-gluing code, because batch mode is just horribly slow. The following performance comparison shows it. Ignore the absolute numbers (milliseconds, actually) and concentrate solely on the proportions:
- Compound Commands: 37ms
- Batch Mode: 112ms
Yes, this is not a typo! Batch mode actually takes about three times as long as gluing SQL strings together manually (112 ms versus 37 ms). In other words, if performance is your target, using batch mode instead of dirty gluing will roughly triple the execution time. Good lord! Why so?
To understand this result, one must see what batch mode actually does: according to the JDBC specification, it does not just append SQL statements, it also returns the number of rows actually modified by each of them. And there's the rub. To be able to return a list of results, the driver must collect them. And this simply costs time. The more SQL statements in the batch, the longer the list of results to keep in memory until the end of the batch, and the higher the RAM consumption. It's as simple as that. I couldn't believe the above numbers at first, so I checked what would happen if I collected the results on my own, using a compound command built from my actual SQL commands plus a temporary table, which in the end is more or less exactly what the driver would have to do. The result is frightening:
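For readers who have not used these results before: Statement.executeBatch() returns an int[] with one entry per command in the batch. The following sketch shows what an application might do with it. The class UpdateCounts and its method are hypothetical; the two constants are part of java.sql.Statement.

```java
import java.sql.Statement;

// Hypothetical helper showing one way to consume the array returned by executeBatch().
class UpdateCounts {

    // Sums the rows reported as modified; entries without a usable count are skipped.
    static long totalModifiedRows(int[] updateCounts) {
        long total = 0;
        for (int count : updateCounts) {
            // Negative entries are markers, not counts:
            // Statement.SUCCESS_NO_INFO (-2) means the command succeeded but the count is unknown,
            // Statement.EXECUTE_FAILED (-3) means the command failed.
            if (count >= 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Sample values invented for this example, not measured results.
        int[] counts = {1, 0, Statement.SUCCESS_NO_INFO, 5};
        System.out.println(totalModifiedRows(counts));
    }
}
```

An application that never looks at this array still pays for the driver building it, which is the cost discussed here.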
- Compound Commands: 37ms
- Batch Mode: 112ms
- Simulating batch using temporary table: 297ms
Simulating batch mode's exact behaviour takes more than two and a half times as long as the driver's batch mode (297 ms versus 112 ms). So it seems batch mode actually is much faster than compound commands are. The first test just was not fair, as the original compound commands were executed without getting back any result. Unfortunately, the JDBC API offers no way to ignore results with batches, so the driver always has to provide them, even if they are not used by the calling application.
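For those who want to check the proportions, the arithmetic behind the three measurements above is simple division. The class SlowdownFactors is a hypothetical helper; the three timings are the ones listed above.

```java
import java.util.Locale;

// Hypothetical helper; the timings are the three measurements quoted in this article.
class SlowdownFactors {

    // How many times longer the slower variant takes compared to the faster one.
    static double factor(double slowerMillis, double fasterMillis) {
        return slowerMillis / fasterMillis;
    }

    public static void main(String[] args) {
        double compound = 37;   // compound commands, no results returned
        double batch = 112;     // JDBC batch mode
        double simulated = 297; // compound commands collecting results in a temporary table

        System.out.println(String.format(Locale.ROOT,
                "batch vs. compound: %.2f", factor(batch, compound)));
        System.out.println(String.format(Locale.ROOT,
                "simulated vs. batch: %.2f", factor(simulated, batch)));
        System.out.println(String.format(Locale.ROOT,
                "simulated vs. compound: %.2f", factor(simulated, compound)));
    }
}
```

This gives roughly 3.03, 2.65 and 8.03: batch mode costs about three times the plain compound command, while collecting results by hand costs about eight times as much.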
What we can learn from this example is: if optimum performance is the target, then there is no general solution. Instead, the application programmer must choose a different approach for each situation. If results are not needed, batch mode should be avoided. But if the application needs results, batch mode is the method of choice. So in the end, the latest driver has a tremendous performance impact in both directions: a gain of roughly a factor of three when results are needed, and a loss of roughly a factor of three when they are not.
Challenged by these insights, I tried more ways to execute lots of commands and did some performance measurements. Stay tuned for the next article, which will reveal the actual speed of prepared statements, result set update mode, and stored procedures. If you liked this article, you will love the next one.
For those interested in the actual source code of the batch simulation, here is a small code snippet that shows the overall idea:
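// Build one compound statement that records each command's row count in a temporary table.
StringBuilder compoundSql = new StringBuilder("BEGIN DECLARE LOCAL TEMPORARY TABLE T (r INTEGER); ");
for (String singleSql : allSqlCommands)
    compoundSql.append(singleSql).append("; INSERT INTO T SELECT @@rowcount; ");
// Finally return the collected row counts as one result set.
compoundSql.append("SELECT r FROM T; END;");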
An overview of all my recent publications can be found on my personal web site, Head Crashing Informatics.