Archive for May, 2008

HSCALE 0.2 released and new project web page

Tuesday, May 6th, 2008

The main focus of version 0.2 was better handling of almost all SQL statements. You can now issue DESC TABLE tbl_name or RENAME TABLE tbl_name TO another_tbl_name on a partitioned table, and you get correct results for SHOW TABLES and similar statements.
Other statements are rejected, and we have settled on the feature set we want to provide for full partition scans.
In addition to that there are some performance improvements.

See the full list of changes.

For the next release (0.3) we will focus on the dictionary-based partition lookup module and further performance improvements.

Finally, there is a new project home page: http://www.hscale.org.

Update: Benchmark HSCALE with MySQL Proxy 0.7.0 (svn) against 0.6.1

Monday, May 5th, 2008

Earlier today I posted these benchmark results testing HSCALE and MySQL Proxy performance.

As Jan Kneschke (the author of MySQL Proxy) pointed out, there are quite a few improvements in the current development version (svn trunk). So I gave revision 369 a try.

The tests were the same as described in my previous post. And indeed we see quite dramatic improvements. While the performance of the Lua scripts stayed almost the same, the footprint of the proxy itself dropped to only 50 to 65% of what it was. Here are the numbers:

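| Version / Concurrency | MySQL | MySQL Proxy | Empty Lua | Tokenizer | QueryAnalyzer | HSCALE w/o partitions | HSCALE w/ partitions |
|-----------------------|-------|-------------|-----------|-----------|---------------|-----------------------|----------------------|
| 0.6.1 / 40            | 217   | 1302        | 7667      | 7091      | 6162          | 7552                  | 7577                 |
| 0.6.1 / 20            | 217   | 557         | 2536      | 4532      | 4524          | 4325                  | 4564                 |
| 0.6.1 / 10            | 287   | 641         | 675       | 1179      | 1813          | 738                   | 2711                 |
| 0.6.1 / 1             | 1906  | 3914        | 4574      | 5299      | 5411          | 4465                  | 6957                 |
| 0.7.0 / 40            | 229   | 1061        | 5165      | 4786      | 5844          | 6163                  | 5950                 |
| 0.7.0 / 20            | 222   | 331         | 2553      | 1900      | 2968          | 3927                  | 4074                 |
| 0.7.0 / 10            | 297   | 489         | 499       | 930       | 1601          | 550                   | 2413                 |
| 0.7.0 / 1             | 1937  | 2895        | 2614      | 3814      | 4499          | 3235                  | 5578                 |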

(all values are “time in ms”)

Looking at the rows for concurrency = 10, you see that the difference between the MySQL server and MySQL Proxy is much smaller for the svn version (192 ms instead of 354 ms). This is a great step forward!

If you compare all the other numbers and relate them to the execution time of MySQL Proxy, you see that the script overhead stayed pretty much the same. So we see great improvements in the general footprint but not in Lua execution.
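To make that comparison concrete, here is a minimal sketch that recomputes both overheads at a concurrency of 10. The timings are copied from the table above; the dictionary layout and function name are only illustrative and not part of the HSCALE test suite.

```python
# Timings in ms at concurrency 10, copied from the table above.
ROWS = {
    "0.6.1": {"mysql": 287, "proxy": 641, "hscale_partitioned": 2711},
    "0.7.0": {"mysql": 297, "proxy": 489, "hscale_partitioned": 2413},
}


def overhead_ms(row, component, baseline):
    """Extra time (ms) a component needs compared to a baseline run."""
    return row[component] - row[baseline]


for version, row in ROWS.items():
    proxy = overhead_ms(row, "proxy", "mysql")
    script = overhead_ms(row, "hscale_partitioned", "proxy")
    print(f"{version}: proxy adds {proxy} ms, HSCALE adds {script} ms on top of the proxy")
```

The proxy's own overhead shrinks noticeably between the two versions, while the time added by the HSCALE script on top of the proxy changes much less.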

And still the test does not scale well beyond 10 parallel threads. As Kay Roepke (co-author of MySQL Proxy) pointed out, MySQL Proxy is currently single threaded, so improvements here were not expected (but would have been fine ;) ).

I hope version 0.7.0 will be released quite soon (last commit in SVN is from February??? according to http://svn.mysql.com/fisheye/browse/mysql-proxy), since the performance improvement is simply great. It would help MySQL Proxy gain more acceptance, as the “latency” is often the number one “reason” not to try out MySQL Proxy.

Benchmark MySQL Proxy and HSCALE

Monday, May 5th, 2008

As part of developing HSCALE, a partitioning / sharding solution, I set up a benchmark test suite. I scripted it so that it is repeatable and can be used to monitor progress and catch performance regressions during development.

Test Suite

The test suite uses mysqlslap to benchmark the overhead of MySQL Proxy itself in a real-life scenario as well as the different components of HSCALE - query analyzing and query rewriting. The complete test suite is available in the svn trunk at http://svn.hscale.org under hscale/test/performance/mysqlslap. There you will find a build.xml - an Ant buildfile that is used to set up the test environment and perform the tests.

Test Strategy

There are several things we want to find out using this benchmark:

  1. How much overhead does MySQL Proxy add in a multiple-server setup?
  2. Does using Lua scripts add substantial overhead?
  3. How many resources does proxy.tokenizer use?
  4. How does HSCALE perform on unpartitioned tables?
  5. How does HSCALE perform on partitioned tables?

As stated above, mysqlslap is used to generate multi-threaded load. It fires this statement:

SELECT
id, category
FROM small
WHERE
small.category='books'
/* Added */ /* some */ /* comments */
/* to */ /* produce */ /* a */ /* higher */
/* tokenizer */ /* load */

against this table and content:

CREATE TABLE small (
id INT UNSIGNED NOT NULL,
category ENUM('books', 'hardware', 'software') NOT NULL,
PRIMARY KEY(id)
) ENGINE=HEAP;

INSERT INTO small (id, category) VALUES (1, 'books');
INSERT INTO small (id, category) VALUES (2, 'hardware');
INSERT INTO small (id, category) VALUES (3, 'software');

Each run sends 10,000 queries to the MySQL Server or MySQL Proxy respectively.
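As a rough illustration of how such a run could be started, the sketch below builds a mysqlslap command line for each concurrency level. It is not the invocation from the project's build.xml: the host name, the port (4040, the MySQL Proxy default), the schema name and the helper function are assumptions, and the SQL comments of the test statement are left out for brevity.

```python
import shlex

# Test statement from above, without the extra comments used to load the tokenizer.
QUERY = "SELECT id, category FROM small WHERE small.category='books'"


def mysqlslap_command(host, port, concurrency, schema="test", queries=10000):
    """Build the argument list for one mysqlslap benchmark run (hypothetical helper)."""
    return [
        "mysqlslap",
        f"--host={host}",
        f"--port={port}",
        f"--create-schema={schema}",          # schema that contains the table `small`
        f"--concurrency={concurrency}",       # number of simulated parallel clients
        f"--number-of-queries={queries}",
        f"--query={QUERY}",
    ]


# "proxy.example" is a placeholder host name, not a real machine from the test setup.
for concurrency in (1, 10, 20, 40):
    print(shlex.join(mysqlslap_command("proxy.example", 4040, concurrency)))
```

Building the argument list instead of a single string avoids quoting problems when the command is later passed to subprocess.run.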

Test Setup

  1. A MySQL server instance (5.0.54-enterprise-gpl-log) on a DELL PowerEdge 2850, 2xQuadCore 2.8GHz, 12GB RAM
  2. A server running MySQL Proxy (version 0.6.1) instances exclusively (DELL PowerEdge 2950, 2xQuadCore 2.33GHz, 8GB RAM)
  3. A test runner on a DELL PowerEdge 1950, 2xQuadCore 1.8GHz, 8GB RAM.

The test suite is totally CPU and memory bound so the IO system doesn’t matter here.

Results

(Chart: benchmark_hscale_0.2_20080505)

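| Concurrency | MySQL | MySQL Proxy | Empty Lua | Tokenizer | QueryAnalyzer | HSCALE w/o partitions | HSCALE w/ partitions |
|-------------|-------|-------------|-----------|-----------|---------------|-----------------------|----------------------|
| 40          | 217   | 1302        | 7667      | 7091      | 6162          | 7552                  | 7577                 |
| 20          | 217   | 557         | 2536      | 4532      | 4524          | 4325                  | 4564                 |
| 10          | 287   | 641         | 675       | 1179      | 1813          | 738                   | 2711                 |
| 1           | 1906  | 3914        | 4574      | 5299      | 5411          | 4465                  | 6957                 |

(all values are time in ms)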

The columns mean:

  1. MySQL: Test run directly against a MySQL server
  2. MySQL Proxy: Test run against a MySQL Proxy server with no additional configuration / script
  3. Empty Lua: A Lua script with an empty function read_request(packet) is used
  4. Tokenizer: Each query is tokenized using proxy.tokenizer
  5. QueryAnalyzer: Tokenizer and query analyzer are used but no query rewriting
  6. HSCALE w/o partitions: HSCALE is used but the table is not partitioned
  7. HSCALE w/ partitions: HSCALE is used against a partitioned table

Conclusions

First of all: please note that these benchmarks measure the maximum overhead of each component, and that this overhead is a constant amount per statement. It does not grow with execution time, so a statement that takes 1 minute to complete on the MySQL server does not take 2 minutes when using MySQL Proxy.

CPU As Limiting Factor

As you can see, with a concurrency of 20 or more everything gets worse and worse. This is because the MySQL Proxy / Lua performance becomes CPU bound. In addition, you can see that the time is spent anywhere but within the Lua scripts: while we see quite distinct values at lower concurrencies (HSCALE w/ and w/o partitions show a huge difference), every benchmark takes almost the same time at 20 or 40 parallel threads.

Looking at top, MySQL Proxy seems to be using a single CPU out of the 8 available. If this is the case, it would be extremely desirable to have MySQL Proxy use all available resources.

MySQL Proxy Overhead

As we can see, putting a plain MySQL Proxy between the application and the MySQL server adds about 100% to 150% to the overall execution time. This is what we could have expected because of the added latency - packets are going through 2 hops instead of 1.

With higher concurrency the overhead grows, and at 40 parallel threads performance breaks down completely. Here CPU seems to be the limiting factor.

Lua Scripts

Adding an empty Lua script to the configuration results in little overhead up to a concurrency of 10. With higher concurrency everything gets worse. Again CPU seems to be the limiting factor.

Tokenizer

The SQL tokenizer adds about 75% compared to an empty Lua script (at a concurrency of 10). So we should avoid it as much as we can. With the results of this benchmark we were able to improve the overall HSCALE performance for non-partitioned tables (see this Issue).

QueryAnalyzer

Since the QueryAnalyzer uses the tokenizer, it inherits the tokenizer's overhead and adds a further 50% on top of it (at a concurrency of 10). There is a lot of room for improvement here. The analyzer is now almost feature complete, so we can concentrate on performance. First of all the algorithm could be optimized (anticipating the fastest path), and then more hinting could be added.
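The percentages quoted for the individual components can be recomputed from the results table. The following sketch compares each stage with the one before it at a concurrency of 10; the list layout and function name are just illustrative.

```python
# Timings in ms at concurrency 10, copied from the results table.
STAGES = [
    ("MySQL Proxy", 641),
    ("Empty Lua", 675),
    ("Tokenizer", 1179),
    ("QueryAnalyzer", 1813),
]


def relative_increase(previous_ms, current_ms):
    """Fractional increase of one stage over the stage before it."""
    return (current_ms - previous_ms) / previous_ms


increases = {}
for (_, prev_ms), (name, cur_ms) in zip(STAGES, STAGES[1:]):
    increases[name] = relative_increase(prev_ms, cur_ms)
    print(f"{name}: +{increases[name]:.0%} over the previous stage")
```

Comparing each stage with its predecessor rather than with the MySQL baseline isolates what each additional component costs.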

HSCALE w/o Partitions

After implementing an improvement for this Issue (avoiding tokenizer) we see that performance for queries against non-partitioned tables is almost as good as for empty Lua scripts.

HSCALE w/ Partitions

Looking at the concurrency level of 10, we see that HSCALE is almost 10 times slower than the MySQL server and more than 4 times slower than a plain MySQL Proxy without a script. Needless to say, this is quite a large factor. With performance improvements we might lower this to a factor of 2 or 3 times slower than MySQL Proxy itself. This is ok since we are still able to perform more than 3,000 statements / s. And finally, we are able to use multiple proxies to spread the load.
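For reference, the slowdown factors and the throughput can be derived from the concurrency 10 row as sketched below. The sketch assumes that the 10,000 queries are the total per run, not per client thread; the helper names are illustrative.

```python
QUERIES_PER_RUN = 10_000  # assumption: total number of queries per run

# Timings in ms at concurrency 10, copied from the results table.
mysql_ms, proxy_ms, hscale_ms = 287, 641, 2711


def slowdown(baseline_ms, measured_ms):
    """How many times slower the measured run is than the baseline."""
    return measured_ms / baseline_ms


def statements_per_second(total_ms, queries=QUERIES_PER_RUN):
    """Throughput of a run that needed total_ms for the given number of queries."""
    return queries / (total_ms / 1000)


print(f"HSCALE vs. MySQL:       {slowdown(mysql_ms, hscale_ms):.1f}x")
print(f"HSCALE vs. MySQL Proxy: {slowdown(proxy_ms, hscale_ms):.1f}x")
print(f"HSCALE throughput:      {statements_per_second(hscale_ms):.0f} statements / s")
```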

Final Thoughts

This benchmark showed us mainly 3 things:

  1. MySQL Proxy adds the expected latency overhead - but not more. Average is about 0.035 milliseconds per query.
  2. Scaling of MySQL Proxy could be improved - using all CPUs
  3. HSCALE adds a maximum overhead of about 0.24 ms per query (against a partitioned table).
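The per-query figures in this list follow from the concurrency 10 row of the results table. Here is a minimal sketch of the arithmetic, again assuming 10,000 queries in total per run; the function name is illustrative.

```python
def per_query_overhead_ms(component_ms, baseline_ms, queries=10_000):
    """Average extra time per query (ms) compared to the direct MySQL run."""
    return (component_ms - baseline_ms) / queries


# Timings in ms at concurrency 10: MySQL 287, MySQL Proxy 641, HSCALE w/ partitions 2711.
proxy_overhead = per_query_overhead_ms(641, 287)
hscale_overhead = per_query_overhead_ms(2711, 287)

print(f"MySQL Proxy: {proxy_overhead:.3f} ms per query")
print(f"HSCALE:      {hscale_overhead:.2f} ms per query")
```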

Please feel free to comment on the results or run the tests on your own.

UPDATE: Corrected the number of milliseconds MySQL Proxy and HSCALE add per query. The old values were 0.35 ms for the proxy and 2.4 ms for HSCALE. The correct numbers are 0.035 ms for the proxy and 0.24 ms for HSCALE.