pero on anything

MySQL Partitioning on Application Side

After following the scale up path (i.e. buy bigger boxes) for a while now, we definitely need to scale out as things start to become “unhandy” and simply HUGE. (See small things are better)

We are running an OLTP application with about 200 million transaction per month. Currently we have some very large tables with more than 1,5 billion rows and the total database size is about 600 GB – all InnoDB.

What we want to implement is application based sharding, i.e. splitting up the big tables and distribute them among many smaller servers. Furthermore we need to implement some sort of archiving mechanism since the db size is growing very fast. Putting old data into compressed myisam tables seems to be a good solution here.

So, how to implement this?

These solutions come into mind immediately:

  1. Using MySQL Cluster (this is not really application wise sharding but I mention it for completeness)
  2. Using MySQL Proxy with transparent query rewriting and load balancing
  3. Implement it into a JDBC driver
  4. Implement it into the application data access layer (DAOs)

Each solution comes with a list of pros and cons.

MySQL Cluster
Using the cluster would bring scalability right away but as we do not have any experience with this and had a lot of issues with the other storage engines (first crashes with MyISAM, then crashes with InnoDB, both being unsolved for a while even though the MySQL Enterprise Support did a great job!) I feel quite reluctant switching to a new storage engine. The hardware requirements would be too high too, so I don’t think that this could be the best solution.

Application Based Sharding
For all the other 3 types of partitioning I would use the following mechanism and meta tables.
Say this is a table I would like to split (Please note that this is just an example with some added redundancy (productCategory) to ease partitioning by product category.):

CREATE TABLE orders (
orderId BIGINT(20) NOT NULL,

productCategory VARCHAR(50) NOT NULL,
productId BIGINT(20) NOT NULL,
date DATETIME NOT NULL,

PRIMARY KEY (orderId)
);

The column productCategory would be the partition key since there are only about 20 to 100 product categories used and the data is spread almost evenly. The real scenario would be different but this is a close and easy to grasp match.

The partition meta data table could look like this:

CREATE TABLE partition (
sourceTable VARCHAR(255) NOT NULL, # would be "orders" in our example
partitionKey VARCHAR(255) NOT NULL, # would be the value of "productCategory"
dbHost VARCHAR(100) NOT NULL,
dbPort VARCHAR(100) NOT NULL,
dbName VARCHAR(100) NOT NULL,
dbUser VARCHAR(100) NOT NULL,
dbPass VARCHAR(100) NOT NULL,
dbTable VARCHAR(100) NOT NULL,

PRIMARY KEY (sourceTable, partitionKey)
);

This way I could assign a different database connection to each product category or a set of categories. Furthermore we can have multiple tables of the same “type” within one database because the table name is configured, too (dbTable). So we could have the tables order_books, orders_hardware, orders_cds on server A and orders_dvds, orders_shirts on server B.

Limitations
This type of partitioning forces all queries to include a partition key (i.e. a productCategory) and IN clauses on the partition key are not possible or at least very hard to implement. Furthermore a full table scan (i.e. selecting rows based on the orderId column) requires to scan all tables. (This could be optimized if you have a correlation between orderId and productCategory in some way which is the case in our scenario – more on that later.).

MySQL Proxy and JDBC Driver
Yesterday I played a bit with the MySQL Proxy and yes it is hell of an awesome thing! I did some sort of query rewriting and load balancing which was easy at first hand.
The approach for both, MySQL Proxy and JDBC driver, would be the same: Rewrite the queries coming from the application by analyzing them and by using hints put into the SQL by the application (where analyzing does not help).

Assume the following query is issued to look up all orders of a product category:

SELECT * FROM orders WHERE productCategory='Books';

After analyzing the query it is clear that we have to lookup the database configuration for the category ‘Books’, so we could rewrite that query (and send it to a different database connection if needed) like this:

SELECT * FROM orders_books WHERE productCategory='Books';

We could even strip off the WHERE clause if there are only books stored in the table “orders_books”.

That was easy. What about more complex queries like (Attention: Bad database design ahead! It is just an example ;)):

SELECT
orders.*,
books.*
FROM
orders,
books
WHERE
orders.productName = books.name

We know that the partition key has to be ‘Books’. But how can our query analyzer know that? It can’t. So there might be queries that are valid but cannot be analyzed. One way is to add a where condition AND orders.productCategory = 'Books', but that might not always be possible or wanted.
Another option would be to add hints like this:

SELECT
orders.*,
books.*
FROM
orders /* HINT: productCategory='Books' */,
books
WHERE
orders.productName = books.name

I think implementing it as a MySQL Proxy LUA script or a JDBC driver would not make a big difference on the approach. I would prefer MySQL Proxy a lot though.

The pros of this approach are very clear:

  1. It is a transparent solution. Only a few parts of the application need to be changed in best case.
  2. It could be easily released as open source since it is not bound to any type of application – a huge plus IMHO!
  3. It keeps the application code simple.

The cons:

  1. Implementation is very challenging. Analyzing the queries would be the toughest part.
  2. Harder to test against the application since it is a runtime dependency. Would have to come up with a good integration test scenario.
  3. Performance could be an issue since the queries need to be parsed.
  4. Application would need to be changed anyway since we cannot issue some sorts of queries anymore. This could be small or a huge impact depending on the application.

Application Data Access Layer (DAO)
This would be the most straightforward way of implementing partitioning. Since we have total control over our application we could adopt the mechanism in any way to meet our needs. The basic concept would be to use a PartitionFactory like this:

public interface PartitionFactory {

public Connection getConnection(String sourceTable, String partitionKey);

public String getTable(String sourceTable, String partitionKey);
}

Within this solution we have to know which partition to use beforehand. Thus every DAO method has to know where to go. This could be done by adding the partition key to every method or rewriting methods to first find out which partition to use.
The path to this solution is clear and very straightforward, I think.

The pros:

  1. Easy to implement since we know what the application wants and does.
  2. Fast. No need to rewrite queries.
  3. Easier to grasp for (database) developers since they see the dependencies right in the code.

The cons:

  1. Proprietary solution unless embedded into other frameworks like Spring, Hibernate or iBatis.
  2. Might blowup the application code making it harder to understand.
  3. A lot of the application needs to be changed.

Summary
We are still in the process of evaluating every solution. Right now I would love to go for the MySQL Proxy solution be it just for its aesthetic image ;) We need to do a bit more prototyping to see if this would really be a good solution.

Any comment, discussion or hint is highly appreciated!

More to follow soon…

14 thoughts on “MySQL Partitioning on Application Side

  • Keith Murphy says:

    I would really recommend that you pick up a copy of the second edition of High Performance MySQL as soon as it comes out. They spend a good portion of the book talking about these issues. It was suppose to be released at the Users Conference but I understand it has been pushed back a bit. You might not have time to wait, but if you are able to wait you won’t be sorry!!

    In particular, the application partitioning information will be very relevant.

  • Jan Kneschke says:

    To filter out the comment you can use the integrated tokenizer and look for TK_COMMENT as token-type. Parsing the comment afterwards should be easy.

    Doesn’t this sound like a Google Summer of Code project ? :)

  • pero says:

    Keith, I read the first edition of High Performance MySQL and will definitely read the second edition, too. Do you know anything particular to look after?

  • pero says:

    Jan, I just read about the Tokenizer – great stuff. This would save a lot of work.

  • Jay Pipes says:

    Just nit-picking, but this:

    CREATE TABLE orders (
    orderId BIGINT(20) NOT NULL,

    productCategory VARCHAR(50),
    productName VARCHAR(255),
    productPrice FLOAT,

    PRIMARY KEY (orderId)
    );

    is not a normalized schema. Why wouldn’t you have a productID that was stored in the Products table with the productName, productPrice, productCategory, etc? Why is this information stored in the Orders table as redundant data?

    Cheers,

    Jay

  • pero says:

    Jay, yes it is some kind of nit-picking ;) It is just an example.

    OTOH some kind of redundant information (like having the product category in the orders – name and price are still just for illustration) would make partitioning a lot easier. (Using a categoryId that refers to a category table would make more sence).

    The grade of redundancy depends on the type of data and the type of queries you want to perform against your data. If 90% of your queries on orders involve a productCategory it might make sense to store that information along with the order. Actually, I use this kind of redundancy in our application for almost all of the huge data sets. Otherwise certain queries would be impossible to run because of the JOIN size/overhead or at least would be painfully slow. And assuming that a productCategory cannot change after an order has been placed makes redundancy less evil. The latter assumption depends on your application and data design of course.

    Anyway, normalization is a good thing and should be enforced where possible but sometimes you have to add redundancy.

    Cheers,

    Peter

  • Peter,

    I find a number of fundamental data architecture in-efficiencies in your schema. You really should invest in an MySQL expert to assist in the most optimal achitecture for large systems.

    At worst, be sure to goto “The top 20 design tips for data architects” at the MySQL conference, you will learn a few things not to do!

    Regards

    Ronald Bradford
    http://www.primebase.org
    PBXT – The Community Storage Engine

  • pero says:

    Ronald,

    as said above: THIS IS JUST AN EXAMPLE with A SINGLE TABLE (the table “partition” is introduced as part of the solutions)! It does not reflect any aspect of any of the projects we are working on and it is indeed nothing one should apply to real life applications (as said in my previous comment).

    I just thought it would help illustrate the possible partitioning solutions which I wanted to discuss in the first place.

    Greetings,

    Peter

    [Edit] I cleaned up the example a bit and only left the redundancy on product category. [Edit]

  • Xaprb says:

    Pero, just a quick note — MySQL Cluster is really a very special-case product and you are going the right route with application partitioning instead. Also, reading through your other comments and responses, I think the other commenters are sort of taking for granted that this is an example, and just trying to add helpful extra info :-)

    I agree with Ronald — help from someone who’s done this before would be worth its weight in gold. Having just walked a company through it for the first time, I can say there is a lot more to it than it seems like.

    Archiving is also a good idea — but it needs to be both/and, not either/or. Maatkit has a handy archiving tool in it.

  • Marco says:

    I too am intrigued by the MySQL Proxy option. Aren’t you concerned about it’s “alpha” status?! I’ve waited many months for it to progress at least to beta; my management will think I’ve lost it if I propose that our data flow through an alpha release. (I realize it’s all relative; Senior managers do not)

    cheers,
    Marco

  • pero says:

    Marco,

    MySQL Proxy is already very stable (partly due to the expertise Jan has gathered with developing lighttpd which architecture seems to be quite similar) and is maturing at a high speed in my opinion. I remember the time when you could have used a MySQL BETA version that would for sure much better than other products already claiming a “Service Pack” or two. These times seem to be gone for the server alone (but the quality is still very good even in early releases). I’m working with the proxy for a couple of weeks now and everything works like a charm.

    At the end I don’t rely on the labeled version number or alpha status but on what our tests – especially unit-, concurrency-, load- and performance tests – say. And all looks good right now, we didn’t have a single problem related to the proxy.

    As for what a manager would say: I’m in the comfortable (or uncomfortable if it goes terribly wrong) situation to be the executive “manager”. ;)

  • [...] written here and here I’ve been working on a MySQL Proxy Lua module that transparently splits up tables [...]

  • JDBC Books says:

    Hubs of MySQL Partitioning on Application Side

    hubs about JDBC Books to books WHERE orders.productName = books.name. I think implementing it as a MySQL Proxy LUA script or a JDBC driver would not make a big difference on the approach. I would prefer MySQL Proxy a lot though. …

  • Is MySQL-partitioning useful for very big real-life-problems?

    Some months ago I helped out in another project in which they had some performance problems. They had a very big table and the index of the table was bigger than the table itself. As every change in the table causes MySQL to recalculate/reload the index,

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>