Archive for the 'mysql' Category

HSCALE 0.2 released and new project web page

Tuesday, May 6th, 2008

The main focus of version 0.2 was to improve handling of almost all of SQL. So now you can issue DESC TABLE tbl_name or RENAME TABLE tbl_name TO another_tbl_name on a partitioned table and you get correct results for SHOW TABLES etc.
Other statements are rejected and we settled down for the feature set we want to provide for full partition scans.
In addition to that there are some performance improvements.

See the full list of changes.

For the next release (0.3) we focus on the dictionary based partition lookup module and further performance improvements.

Finally, there is a new project home page: http://www.hscale.org.

Update: Benchmark HSCALE with MySQL Proxy 0.7.0 (svn) against 0.6.1

Monday, May 5th, 2008

Earlier today I posted these benchmark results testing HSCALE and MySQL Proxy performance.

As Jan Kneschke (the author of MySQL Proxy) pointed out there are quite some improvements in the current development version (svn trunk). So I gave revision 369 a try.

Tests were all the same as mentioned in my previous post. And indeed we see quite dramatic improvements. While the performance of the Lua script stayed almost the same the footprint of the proxy itself sank to only 50 to 65%. Here are the numbers:

Version / Concurrency MySQL MySQL Proxy Empty Lua Tokenizer QueryAnalyzer HSCALE w/o partitions HSCALE w/ partitions
0.6.1 / 40 217 1302 7667 7091 6162 7552 7577
0.6.1 / 20 217 557 2536 4532 4524 4325 4564
0.6.1 / 10 287 641 675 1179 1813 738 2711
0.6.1 / 1 1906 3914 4574 5299 5411 4465 6957
0.7.0 / 40 229 1061 5165 4786 5844 6163 5950
0.7.0 / 20 222 331 2553 1900 2968 3927 4074
0.7.0 / 10 297 489 499 930 1601 550 2413
0.7.0 / 1 1937 2895 2614 3814 4499 3235 5578

(all values are “time in ms”)

Looking at the highlighted rows (concurrency = 10) you see that the difference between the MySQL server and MySQL Proxy is much smaller for the svn version. This is a great step forward!

If you compare all the other numbers and relate them to the execution time of the MySQL Proxy you see that the overhead stayed pretty much the same. So we see great improvements in general footprint but not in Lua execution.

And still the test does not scale well beyond 10 parallel threads. As Kay Roepke (co-author of MySQL Proxy) pointed out MySQL Proxy is currently single threaded and thus improvements on this were not expected (but would have been fine ;) ).

I hope version 0.7.0 is to be released quite soon (last commit in SVN is from February??? according to http://svn.mysql.com/fisheye/browse/mysql-proxy) since the performance improvement is simply great and this would help MySQL Proxy gaining more acceptance as the “latency” is often the number one “reason” not to try out MySQL Proxy.

Benchmark MySQL Proxy and HSCALE

Monday, May 5th, 2008

As part of developing HSCALE, a partitioning / sharding solution, I set up a benchmark test suite. I made it scripted and thus repeatable to monitor the progress and performance regressions during the development.

Test Suite

The test suite uses mysqlslap to benchmark the overhead of MySQL Proxy itself in real life scenario as well as the different components of HSCALE - query analyzing and query rewriting. The complete test suite is available in the svn trunk at http://svn.hscale.org under hscale/test/performance/mysqlslap. There you find a build.xml - an Ant buildfile that is used to set up the test environment and perform the tests.

Test Strategy

There are several things we want to find out using this benchmark:

  1. How much overhead adds MySQL Proxy in a multiple server setup?
  2. Does using Lua scripts add substantial overhead?
  3. How much resources does the proxy.tokenizer use?
  4. How does HSCALE perform on unpartitioned tables?
  5. How does HSCALE perform on partitioned tables?

As stated above mysqlslap is used to generate multi-threaded load. mysqlslap is used to fire this statement:

SELECT
id, category
FROM small
WHERE
small.category='books'
/* Added */ /* some */ /* comments */
/* to */ /* produce */ /* a */ /* higher */
/* tokenizer */ /* load */

against this table and content:

CREATE TABLE small (
id INT UNSIGNED NOT NULL,
category ENUM('books', 'hardware', 'software') NOT NULL,
PRIMARY KEY(id)
) ENGINE=HEAP;

INSERT INTO small (id, category) VALUES (1, 'books');
INSERT INTO small (id, category) VALUES (2, 'hardware');
INSERT INTO small (id, category) VALUES (3, 'software');

Each run sends 10,000 queries to the MySQL Server or MySQL Proxy respectively.

Test Setup

  1. A MySQL server instance (5.0.54-enterprise-gpl-log) on a DELL PowerEdge 2850, 2xQuadCore 2.8GHz, 12GB RAM
  2. A server running MySQL Proxy (version 0.6.1) instances exclusively (DELL PowerEdge 2950, 2xQuadCore 2.33GHz, 8GB RAM)
  3. A test runner on a DELL PowerEdge 1950, 2xQuadCore 1.8GHz, 8GB RAM.

The test suite is totally CPU and memory bound so the IO system doesn’t matter here.

Results

benchmark_hscale_0.2_20080505

Concurrency MySQL MySQL Proxy Empty Lua Tokenizer QueryAnalyzer HSCALE w/o partitions HSCALE w/ partitions
40 217 1302 7667 7091 6162 7552 7577
20 217 557 2536 4532 4524 4325 4564
10 287 641 675 1179 1813 738 2711
1 1906 3914 4574 5299 5411 4465 6957

Each test means:

  1. MySQL: Test ran directly against a mysql server
  2. MySQL Proxy: Test ran directly against a MySQL Proxy server with no additional configuration / script
  3. Empty Lua: A Lua script with an empty function read_request(packet) has been used
  4. Tokenizer: Each query has been tokenized using proxy.tokenizer
  5. QueryAnalyzer: Tokenizer and query analyzer are used but no query rewriting
  6. HSCALE w/o partitions: HSCALE is used but the table is not partitioned
  7. HSCALE w/ partitions: HSCALE is used against a partitioned table

Conclusions

First of all: Please note that these benchmarks measure the maximum overhead of each component and that overhead is constant meaning that a statement that takes 1 minute to complete on the MySQL server does not take 2 minutes when using MySQL Proxy.

CPU As Limiting Factor

As you can see with a concurrency of 20 or more everything gets worse and worse. This is because the MySQL Proxy / Lua performance becomes CPU bound. In addition to that you can see that the time is spent anywhere but within the Lua scripts: While we see quite distinct performance values for lower concurrencies (HSCALE w/ and w/o partitions show a huge difference) every benchmarks takes almost the same time at 20 or 40 parallel threads.

Looking at top the MySQL Proxy seems to be using a single CPU out of 8 available. If this is the case it would be extremely desirable to have MySQL Proxy use all available resources.

MySQL Proxy Overhead

As we can see putting a plain MySQL Proxy between application and MySQL server adds about 100% to 150% to the average overall performance. This is what we could have expected because of the added latency - packets are going through 2 hops instead of 1.

With higher concurrency the overhead grows until it totally drops at 40 parallel threads. Here CPU seems to be the limiting factor.

Lua Scripts

Adding an empty Lua script to the configuration results in little overhead up to a concurrency of 10. With higher concurrency everything gets worse. Again CPU seems to be the limiting factor.

Tokenizer

The SQL tokenizer adds about 75% compared to an empty Lua script. So we should avoid it as much as we can. With the results of this benchmark we were able to improve the overall HSCALE performance for non-partitioned tables (see this Issue).

QueryAnalyzer

Since the QueryAnalyzer utilizes the tokenizer it implies its overhead and adds additional 50% (at a concurrency of 10). Here is a lot of room for improvement. Currently the analyzer is almost complete so we can concentrate on performance. First of all the algorithm could be optimized (anticipating the fastest path) and then more hinting could be added.

HSCALE w/o Partitions

After implementing an improvement for this Issue (avoiding tokenizer) we see that performance for queries against non-partitioned tables is almost as good as for empty Lua scripts.

HSCALE w/ Partitions

Looking at the concurrency level of 10 we see that HSCALE performs 10 times slower that the MySQL server and 5 times slower than an empty MySQL Proxy. Needless to say that this is quite a huge number. With performance improvements we might lower this to a factor of 2 or 3 times slower than MySQL Proxy itself. This is ok since we are still able to perform more than 3,000 statements / s. And finally we are able to use multiple proxies to spread the load.

Final Thoughts

This benchmark showed us mainly 3 things:

  1. MySQL Proxy adds the expected latency overhead - but not more. Average is about 0.035 milliseconds per query.
  2. Scaling of MySQL Proxy could be improved - using all CPUs
  3. HSCALE adds a maximum overhead of about 0.24 ms per query (against a partitioned table).

Please feel free to comment on the results or run the tests on your own.

UPDATE: Corrected the number of milliseconds MySQL Proxy and HSCALE add per query: Old were 0.35 ms for proxy and 2.4 ms for HSCALE. The correct numbers are 0.035 ms for proxy and 0.24 ms for HSCALE.

Presentation Slides: Introduction to HSCALE

Tuesday, April 15th, 2008

No, these slides are not fresh from the User Conference in Santa Clara… ;)

Today, I held a presentation in front of all developers and support engineers of our technical department about database partitioning, MySQL Proxy, HSCALE and the progress we are making.

Download the presentation slides here.

HSCALE 0.1 released - Partitioning Using MySQL Proxy

Thursday, April 10th, 2008

As written here and here I’ve been working on a MySQL Proxy Lua module that transparently splits up tables into multiple partitions and rewriting all queries to go to the right partition.

I finally got everything together to release a 0.1 version. Go on and download, try and read more about HSCALE 0.1.

All this started out as a prototype just to see if it could be done. And after adopting parts of our main product to use partitions via HSCALE + MySQL Proxy (which was an easy task, we just had to rewrite a few out of hundreds of statements) I really think that this could work out in a larger scale.

What Will Come Next?

Just a few notes on what I am working on right now:

Project Page And Issue Tracker

In a few days there will be a “real” project page with more documentation and an issue tracker ready. Since we already have both in use internally this should be an easy task.

Write Another Partition Lookup Module

A partition lookup module decides the partition(s) to use for a particular query. In the current release there is only a ModulusPartitionLookup integrated. Since the partition lookup module is pluggable it is easy to write other modules doing other things. The main focus now is to implement a DictionaryLookupService which will store the information of which partition is where inside the database. This allows you to add and move partitions “on the fly”. At the end you just have more control over the partition scheme.

Along with the new partition lookup module there will be more administrative SQL commands like:
HSCALE ADD PARTITION ..., HSCALE MOVE PARTITION ... and so on.

Full Partition Scans For Queries With Multiple Partitioned Tables

In the current version HSCALE is already capable of performing full partition scans for queries that don’t use the partition column and thus don’t provide a partition key like:

SELECT * FROM my_partitioned_table;

(Just a side note: Results returned from this query are not in natural order due to the fact that the data is spread over multiple tables. Thus your application cannot rely on the natural order for statements against partitioned tables (if full partition scan is performed). You should not rely on natural order anyway.)

Even though you should avoid full partition scans where you can sometime you just have to look into every partition. And even worse sometimes you join multiple partitioned tables like:

SELECT * FROM my_partitioned_table LEFT JOIN my_other_partitioned_table ON …;

Currently HSCALE rejects queries of this kind. In future it will join every partition of my_partitioned_table with every partition of my_other_partitioned_table. In most cases this is evil but sometimes you just have to.

Finally it is up to the partition lookup module to find out the combinations of partitions to use so we can optimize here for tables that use the same partitioning scheme.

Performance Profiling And Optimization

I was really astonished by the performance of the Lua scripting inside MySQL Proxy. I was able to analyze more than 100,000 statements in just a few seconds (without network overhead). This is already pretty good but can be improved. First of all I will have to find out the performance patterns to be used when scripting with Lua like “Is it better to inline functions?”, “Does ‘OO’ hurt?” and so on. Then performance tests analyzing both, speed and memory consumption, will have to be implemented to see if there is progression.

Distribute Partitions Across Multiple MySQL Servers

Right now we need to just split up huge tables but later on we want to distribute partitions over multiple MySQL server instances to have real horizontal scale out. The hardest part will be dealing with transactions where we have to use distributed transactions (XA) or disallow transactions involving partitions on different hosts. The latter one works well for parts of (our) application since they just don’t use transactions. Other parts will have to use XA. At this point I am not sure about the overhead XA will add but this has to be worked out once we come to this.

So, any feedback is welcome!

MySQL Proxy Lua scripts cannot be written as a module?

Saturday, April 5th, 2008

Yesterday I went through hell while testing the MySQL Proxy partition Lua scripts I am working on in a high concurrency environment. I am sending multiple queries to the server and build up a combined result set in read_query_result - something like this. The proxy returned weird results complaining about multiple result sets being sent from time to time at totally different places even though I was sure that for each query only one result set has been sent. And some of the results were just wrong, a lot of our tests failed unexpectedly.

After long hours flicking almost every switch I simply removed the Lua module declaration from the main LUA script passed to the proxy via proxy-lua-script since it was the only thing left that changed. And then everything worked just fine!

I guess defining the proxy Lua script as a module messes up the scope of the script, but I haven’t researched that yet. So for now I just don’t declare the main proxy Lua script as a module and everything works fine.

Progress on MySQL Proxy Partitioning

Saturday, March 29th, 2008

As posted here I started to think about possible ways to implement database sharding/partitioning.

I finally found the time to start prototyping a MySQL Proxy based solution that would allow you to analyze and rewrite queries to direct them to different databases. So this would going to be a nearly 100% transparent solution (some queries are impossible to support due to the nature of having multiple tables in different locations).

How does it work?
The main goal is to split up mysql tables and optionally put each of the resulting partitions on different mysql servers.

For now I concentrate on splitting up big tables into smaller ones within the same database. Distribution of these tables (i.e. partitions) over multiple databases would be the final goal and a lot more challenging task (think of transactions).

The work to be done would be divided into these 4 steps:

1. Analyze the query to find out which tables are involved and what the parition key would be (i.e. the value of the partition column or a hint - more on that later).
1.a. Validate the query and reject queries that cannot be analyzed (missing partition key etc.)

2. Determine the partition table / database. This could be done by a simple lookup, a hashing function or anything else.

3. Rewrite the query and replace the table names with the partition table names.

4. Execute the query on the correct database server and return the result back to the client.

An example:

Say this is the table you want to split up:

CREATE TABLE books (
    id INTEGER NOT NULL,
    name VARCHAR(100) NOT NULL,
    author INTEGER NOT NULL,
   
    PRIMARY KEY(id)
);

The partition tables for table books are named books_even and books_odd all with the same layout as books. The partition column is author so this value determines the partition table to be used. In this example we put all books of authors with an even id into books_even and the “odd ones” (meaning the “not even” ones not the strange ones ;) ) into books_odd.

Say the following query is sent to the proxy:

SELECT * FROM books WHERE author = 3;

The proxy would do the following:

1. Analyze the query and find that table books is used and that it is a partitioned table. We have defined that author is our partition column so “3” is our partition key. Both will be passed to the next part:

2. Lookup the partition table for books and partition key3” => books_odd.

3. Rewrite the query to:

SELECT * FROM books_odd WHERE author = 3;

4. Execute it and send the result to the client.

What is the status right now?
After getting warm with LUA and setting up a (unit) test environment, which was an easy task since MySQL Proxy already comes with a handy solution for that, I started to implement the query analyzer (step 1) and rewriter (step 3). Both utilize the tokenizer provided by the proxy and don’t rely on regular expressions because it would be to be too error prone in my opinion.

There is already a lot of code and tests ready and it turns out to work as intended. But still there is a lot of work and head ache ahead. My next goal is to run the test suite of one of our major products with about 2,000 database bound tests against the partitioner to see how this works out.

The analyzer is already able to parse various types of queries like:

SELECT * FROM books WHERE author = 3;
SELECT * FROM books AS b, prices WHERE b.author = 3;
INSERT INTO books (id, author) VALUES (1, 3);
DELETE FROM books WHERE author = 3;
UPDATE books SET name = ‘new’ AND author = 3;

# Hinting works like this
/* partitionKey(books) = ‘books_odd’ */ SELECT * FROM books;
/* skipPartition() */ SELECT ‘do not analyze me!’;

… and a lot more like queries including comments or joins or invalid queries (i.e. no partition key provided).

The rewriter is able to rewrite all of the queries above.

Actually, I think a good deal of the queries of the application I intended to test with would correctly be analyzed and rewritten right now. But there is still a lot of work to do on this side to handle more complex queries including subselects, functions etc.

What’s next?
As said above the next goal is to run the test suites of our applications utilizing the partitioner. Once this is point is reached I will post the prototype here.

Up to now all of this looks like it could really work for us so I hope there is going to be more than a prototype but we will have to see.

MySQL Partitioning on Application Side

Wednesday, March 26th, 2008

After following the scale up path (i.e. buy bigger boxes) for a while now, we definitely need to scale out as things start to become “unhandy” and simply HUGE. (See small things are better)

We are running an OLTP application with about 200 million transaction per month. Currently we have some very large tables with more than 1,5 billion rows and the total database size is about 600 GB - all InnoDB.

What we want to implement is application based sharding, i.e. splitting up the big tables and distribute them among many smaller servers. Furthermore we need to implement some sort of archiving mechanism since the db size is growing very fast. Putting old data into compressed myisam tables seems to be a good solution here.

So, how to implement this?

These solutions come into mind immediately:

  1. Using MySQL Cluster (this is not really application wise sharding but I mention it for completeness)
  2. Using MySQL Proxy with transparent query rewriting and load balancing
  3. Implement it into a JDBC driver
  4. Implement it into the application data access layer (DAOs)

Each solution comes with a list of pros and cons.

MySQL Cluster
Using the cluster would bring scalability right away but as we do not have any experience with this and had a lot of issues with the other storage engines (first crashes with MyISAM, then crashes with InnoDB, both being unsolved for a while even though the MySQL Enterprise Support did a great job!) I feel quite reluctant switching to a new storage engine. The hardware requirements would be too high too, so I don’t think that this could be the best solution.

Application Based Sharding
For all the other 3 types of partitioning I would use the following mechanism and meta tables.
Say this is a table I would like to split (Please note that this is just an example with some added redundancy (productCategory) to ease partitioning by product category.):

CREATE TABLE orders (
    orderId BIGINT(20) NOT NULL,

    productCategory VARCHAR(50) NOT NULL,
    productId BIGINT(20) NOT NULL,
    date DATETIME NOT NULL,

    PRIMARY KEY (orderId)
);

The column productCategory would be the partition key since there are only about 20 to 100 product categories used and the data is spread almost evenly. The real scenario would be different but this is a close and easy to grasp match.

The partition meta data table could look like this:

CREATE TABLE partition (
    sourceTable VARCHAR(255) NOT NULL, # would be "orders" in our example
    partitionKey VARCHAR(255) NOT NULL, # would be the value of "productCategory"
    dbHost VARCHAR(100) NOT NULL,
    dbPort VARCHAR(100) NOT NULL
    dbName VARCHAR(100) NOT NULL,
    dbUser VARCHAR(100) NOT NULL,
    dbPass VARCHAR(100) NOT NULL,
    dbTable VARCHAR(100) NOT NULL,

    PRIMARY KEY (sourceTable, partitionKey)
);

This way I could assign a different database connection to each product category or a set of categories. Furthermore we can have multiple tables of the same “type” within one database because the table name is configured, too (dbTable). So we could have the tables order_books, orders_hardware, orders_cds on server A and orders_dvds, orders_shirts on server B.

Limitations
This type of partitioning forces all queries to include a partition key (i.e. a productCategory) and IN clauses on the partition key are not possible or at least very hard to implement. Furthermore a full table scan (i.e. selecting rows based on the orderId column) requires to scan all tables. (This could be optimized if you have a correlation between orderId and productCategory in some way which is the case in our scenario - more on that later.).

MySQL Proxy and JDBC Driver
Yesterday I played a bit with the MySQL Proxy and yes it is hell of an awesome thing! I did some sort of query rewriting and load balancing which was easy at first hand.
The approach for both, MySQL Proxy and JDBC driver, would be the same: Rewrite the queries coming from the application by analyzing them and by using hints put into the SQL by the application (where analyzing does not help).

Assume the following query is issued to look up all orders of a product category:

SELECT * FROM orders WHERE productCategory=‘Books’;

After analyzing the query it is clear that we have to lookup the database configuration for the category ‘Books’, so we could rewrite that query (and send it to a different database connection if needed) like this:

SELECT * FROM orders_books WHERE productCategory=‘Books’;

We could even strip off the WHERE clause if there are only books stored in the table “orders_books”.

That was easy. What about more complex queries like (Attention: Bad database design ahead! It is just an example ;)):

SELECT
    orders.*,
    books.*
FROM
    orders,
    books
WHERE
    orders.productName = books.name

We know that the partition key has to be ‘Books’. But how can our query analyzer know that? It can’t. So there might be queries that are valid but cannot be analyzed. One way is to add a where condition AND orders.productCategory = 'Books', but that might not always be possible or wanted.
Another option would be to add hints like this:

SELECT
    orders.*,
    books.*
FROM
    orders /* HINT: productCategory=’Books’ */,
    books
WHERE
    orders.productName = books.name

I think implementing it as a MySQL Proxy LUA script or a JDBC driver would not make a big difference on the approach. I would prefer MySQL Proxy a lot though.

The pros of this approach are very clear:

  1. It is a transparent solution. Only a few parts of the application need to be changed in best case.
  2. It could be easily released as open source since it is not bound to any type of application - a huge plus IMHO!
  3. It keeps the application code simple.

The cons:

  1. Implementation is very challenging. Analyzing the queries would be the toughest part.
  2. Harder to test against the application since it is a runtime dependency. Would have to come up with a good integration test scenario.
  3. Performance could be an issue since the queries need to be parsed.
  4. Application would need to be changed anyway since we cannot issue some sorts of queries anymore. This could be small or a huge impact depending on the application.

Application Data Access Layer (DAO)
This would be the most straightforward way of implementing partitioning. Since we have total control over our application we could adopt the mechanism in any way to meet our needs. The basic concept would be to use a PartitionFactory like this:

public interface PartitionFactory {
 
    public Connection getConnection(String sourceTable, String partitionKey);
 
    public String getTable(String sourceTable, String partitionKey);
}

Within this solution we have to know which partition to use beforehand. Thus every DAO method has to know where to go. This could be done by adding the partition key to every method or rewriting methods to first find out which partition to use.
The path to this solution is clear and very straightforward, I think.

The pros:

  1. Easy to implement since we know what the application wants and does.
  2. Fast. No need to rewrite queries.
  3. Easier to grasp for (database) developers since they see the dependencies right in the code.

The cons:

  1. Proprietary solution unless embedded into other frameworks like Spring, Hibernate or iBatis.
  2. Might blowup the application code making it harder to understand.
  3. A lot of the application needs to be changed.

Summary
We are still in the process of evaluating every solution. Right now I would love to go for the MySQL Proxy solution be it just for its aesthetic image ;) We need to do a bit more prototyping to see if this would really be a good solution.

Any comment, discussion or hint is highly appreciated!

More to follow soon…

Beware: MySQL - Implicit Transaction Commit and Rollback

Thursday, March 15th, 2007

Today I had a hard time dealing with mysql and transactions. Somehow my transaction has been committed in the middle of the statements. A short look into the manual revealed that there are SQL command that implicit commit a transaction. While I do understand that CREATE TABLE commits a transaction, I was surprised that LOAD DATA INFILE does it too!

And since I use the nice REPLACE INTO SELECT ... workaround found on mysqlperformanceblog (Hint: Always a great resource!) I had to upgrade my development mysql server because since version 5.0.28 LOAD DATA INFILE does not commit a transaction.

MySQL manual

Where to start mysql performance tuning?

Thursday, February 22nd, 2007

I recently found a great article on mysql performance blog giving you a short and precise description of what to tune in the first place.

This is a must!