Archive for September, 2008

SHOW STATUS considered harmful

Thursday, September 11th, 2008

First of all, I know this is a known problem, but it struck me so hard, I just had to write about it!

As Peter Zaitev points out calling SHOW STATUS might have a huge performance impact.

We recently replaced one of our servers with a DELL R900 with 96GB RAM. Having a disk-bound workload and > 1TB worth of data in InnoDB we expected a noticeable performance gain compared to the former server with 32GB. The new server even has better RAID and HDD.

But that was not the case. Things got even worse! A lot of queries “hang”, server load peaked at almost 7 and we saw a lot of cpu activity. Just before I started a deep analysis of what is going on inside I spotted a SHOW GLOBAL STATUS which ran every second. DOH!

Where did it came from? An administrator was running MySQL Administrator and used the Health chart to monitor some variables. So it periodically sent SHOW GLOBAL STATUS to the server. That resulted in a lot of queries waiting for the buffer pool (look at Peter’s post and the comments to understand why). And things get worse with bigger InnoDB buffer pool (this particular mysql instance uses 70GB!).

MySQL Administrator (the tool, not the person ;) ) shut down - everything just looks great now!

LuaSQL fetches results about 15% faster than MySQL Proxy?

Sunday, September 7th, 2008

While evaluating LuaSQL as backend connection replacement I came across this. I did a quick performance test using mysqlslap and it showed that just reading and copying the result can be significantly faster with LuaSQL.

Benchmark details

What I did was just sending the query to the backend and building up a new result-set in LUA.

This is the code for LuaSQL:

require("luasql.mysql")
local _sqlEnv = assert(luasql.mysql())
local _con = nil

function read_auth(auth)
    local host, port = string.match(proxy.backends[1].address, "(.*):(.*)")
    – We explicitly connect to db "test" since mysqlslap drops the database
    – and LuaSQL needs the db to exists beforehand. Anyway this is just a
    – quick tests, so don’t bother.
    _con = assert(_sqlEnv:connect("test", auth.username, auth.password, host, port))
end

function disconnect_client()
    assert(_con:close())
end

function read_query(packet)
    if (packet:byte() == proxy.COM_QUERY) then
        local query = packet:sub(2)
        local result = nil
        local cur = assert(_con:execute(query))
        if (type(cur) == "number") then
            proxy.response.type = proxy.MYSQLD_PACKET_RAW;
            proxy.response.packets = {
                "\000" .. – fields
                string.char(cur) ..
                "\000" – insert_id
            }
            result = proxy.PROXY_SEND_RESULT
        else
            – Build up the result set.
            local fields = {}
            local colNames = cur:getcolnames()
            local colTypes = cur:getcoltypes()
            for a = 1, #colNames, 1 do
                table.insert(fields, {name = colNames[a], type=proxy.MYSQL_TYPE_STRING})
            end
            local curRow = {}
            local rows = {}
            while (cur:fetch(curRow)) do
                table.insert(rows, curRow)
            end
            proxy.response = {
                type = proxy.MYSQLD_PACKET_OK,
                resultset = {
                    fields = fields,
                    rows = rows
                }
            }
            result = proxy.PROXY_SEND_RESULT
        end

        if (result ~= nil) then
            return result
        end
    end
end

And this is the code for using MySQL Proxy only:

function read_query(packet)
    if (packet:byte() == proxy.COM_QUERY) then
        local query = packet:sub(2)
        – We append the query so read_query_result gets triggered.
        proxy.queries:append(1, string.char(proxy.COM_QUERY) .. query)
        return proxy.PROXY_SEND_QUERY
    end
end

function _read_query_result(inj)
    local resultSet = assert(inj.resultset)
    local newFields = nil
    local fieldCount = 1
    local fields = resultSet.fields
    if (fields) then
        newFields = {}
        while fields[fieldCount] do
            table.insert(
                newFields,
                {
                    type = fields[fieldCount].type,
                    name = fields[fieldCount].name
                }
            )
            fieldCount = fieldCount + 1
        end
    end

    local newRows = nil
    if (resultSet.rows) then
        newRows = {}
        for row in resultSet.rows do
            table.insert(newRows, row)
        end
    end

    if (newFields) then
        proxy.response = {
            type = proxy.MYSQLD_PACKET_OK,
            resultset = {
                fields = newFields,
                rows = newRows
            }
        }
        return proxy.PROXY_SEND_RESULT
    end
end

As you can see we do nothing but copy the result-set in LUA. This mimics the result-set aggregation HSCALE does if a full partition scan is necessary.

Results

Using LuaSQL:

> $ mysqlslap -h 127.0.0.1 -P 4040 –auto-generate-sql –number-of-queries=10000
Benchmark
        Average number of seconds to run all queries: 65.731 seconds
        Minimum number of seconds to run all queries: 65.731 seconds
        Maximum number of seconds to run all queries: 65.731 seconds
        Number of clients running queries: 1
        Average number of queries per client: 10000

Using MySQL Proxy only:

> $ mysqlslap -h 127.0.0.1 -P 4040 –auto-generate-sql –number-of-queries=10000
Benchmark
        Average number of seconds to run all queries: 74.607 seconds
        Minimum number of seconds to run all queries: 74.607 seconds
        Maximum number of seconds to run all queries: 74.607 seconds
        Number of clients running queries: 1
        Average number of queries per client: 10000

For comparison: Using empty read_query and read_query_result functions:

> $ mysqlslap -h 127.0.0.1 -P 4040 –auto-generate-sql –number-of-queries=10000
Benchmark
        Average number of seconds to run all queries: 39.657 seconds
        Minimum number of seconds to run all queries: 39.657 seconds
        Maximum number of seconds to run all queries: 39.657 seconds
        Number of clients running queries: 1
        Average number of queries per client: 10000

Versions used: MySQL Proxy 0.7.0 (svn-rev 511), LuaSQL 2.1.1, MySQL server 5.0.51a, mysqlslap 5.1.26rc

Of course I repeated the tests several times to verify the results.

Without digging too deep into the source of both MySQL Proxy and LuaSQL the biggest difference is that LuaSQL pushes the result-set row-by-row onto the LUA-stack whereas MySQL Proxy puts the whole result.

Update: As Jan points out below this is not true. MySQL Proxy puts the result row by row onto th LUA stack, too.

Update #2: The tests above ran on my Notebook (MacBookPro 2.4GHz, 4GB RAM running Ubuntu 8.04 64 bit). They are reproducible. Running the same tests on an 8-core-server putting the MySQL database on another server results in the MySQL Proxy version running slightly faster (about 2-5%) than the LuaSQL version.

Conclusions

Even though this tiny benchmark showed that the speed of LuaSQL seems to be feasible, there are still drawbacks. First of all: Depending on your workload only a fraction of your queries need result-set altering. Namely it’s only full partition scans that need this. Most of the time you just need to change the table name or the backend. And then LuaSQL is 100% slower than MySQL Proxy alone.

Another downside of LuaSQL is that it does not return the mysql field types but only the LUA types. This makes it impossible to build up a correct result-set that can be sent back to the client.

So still we need suitable (for HSCALE) backend connection handling in MySQL Proxy if we want higher performance.

Built-in result-set merging would be a big win, too. Then we could even have streaming combined result-sets taking the memory pressure from the proxy (since every result-set has to be fully loaded into memory).

That said I think about using LuaSQL for configuration handling since it is a lot easier than doing it via proxy.queries:append -> read_query_result -> proxy.queries:append -> ....

Version 0.3 of HSCALE is almost in the door

Saturday, September 6th, 2008

After working on build and test improvements (for example incorporating lualint and LuaCov) as well as other lua “side-projects” (i.e. Log4LUA) we are running towards HSCALE 0.3.

The focus of the forthcoming version 0.3 of HSCALE is Dictionary Based Partition Lookup. Using this partition lookup module lets you take full control over how your partitions are created and where they are actually located.

Update: Dictionary Based Partition Lookup is fully implemented. See this blog post and the wiki page about it.

Please note: Due to the problems with backend connection handling version 0.3 will still focus on single server backends. Even though support for multiple backends is already implemented in HSCALE. Please look at the proxyUnit tests for a glimpse on how multi-server backends will be handled.

Please check out the current development snapshot at svn.hscale.org to see what is already there. Currently the partition lookup is fully functional but the administrative commands and some further hashing functions are missing (see below).

How does dictionary partition lookup work?

As the name implies partitions are looked up in a dictionary. The dictionary itself is stored in the main database and cached internally. So now you can freely move partitions around and create new ones.

What is done internally is this:

  1. Apply a hashing function (read further) to the value.
  2. Lookup the partition based on the hashed value.
  3. If no partition has been found use the default partition created for every partitioned table.
  4. Return the partition (and assigned backend)

Hashing functions

To reduce the number of partitions to be created and the overall administration overhead, a hashing function may be applied to the partition value before the partition is looked up.
Currently there are 3 hashing functions available:

  • MOD(X): A modulus function grouping X partition values together. Works only for numbers of course. If you have *lots* of different partition values with a smaller number of rows each, then it might be better to group them together instead of creating *lots* of partitions.
    Example: Using MOD(3) the values 1, 4 and 7 will end up in the same partition.
  • PREFIX(length) Partition values are grouped together by the first length characters. Works on everything (everything is treated as string).
    Example: Using PREFIX(3) the values “foo“, “foobar” and “footaliciuos” will end up in the same partition.
  • NONE(): This function does … nothing! Use it if you really want a 1:1 relationship between partition values and partitions.

Further hashing functions are planned (and might make it to version 0.3):

  • DIV(X): As opposite to MOD(X) this function divides the partition value by X. So this like a fixed range function. While MOD(X) creates at most X partitions DIV(X) creates infinite number of partitions.
  • DATE(pattern): Enables date-range based partitions.
    Example: Using DATE("%Y-%m") will group by (year-)month.

Administrative commands

Because handling of partitions is a delicate thing, creation and maintenance of partitions should not be left to some obscure SQL-statements. Therefor the dictionary partition lookup will provide administrative commands that try to avoid mis-configuration. It will be checked whether partitions overlap etc.

Because this is still work in progress the commands might change.

Table setup

First of all your table has to be set up:

HSCALE SETUP_TABLE(‘[table]’, ‘[column]’, ‘[default table]’, [backend], ‘[hashing function]’)

# Example
HSCALE SETUP_TABLE(‘users, ‘nickname‘, ‘users_default‘, 1, ‘PREFIX(3)‘)

The table has to exist before and will be renamed to the default table name ('users_default' in our example). SETUP_TABLE can only be called once per table.

Add partitions

HSCALE ADD_PARTITION(‘[table]’, ‘[partition name]’, ‘[partition value]’, [backend])

# Example
HSCALE ADD_PARTITION(‘users’, ‘nick1′, ‘mar’, 1)

The example above creates a partition with the name 'nick1' for table 'users' and partition value 'mar' (users with nickname 'marvel', 'martin' etc.). The partition name will directly reflect to the table name of the partition. So the table for this partition will be users_nick1. Usually you would use the partition value ('mar') as partition name but you don’t have to. You are able store multiple partitions in the same table. This sounds strange but it makes sense to define a finer partitioning scheme upfront and actually use a wider scheme until you really need to split up data. This makes it a lot easier to split things up afterwards.

Moving partition data

In version 0.3 partition data will not be moved to a newly created partition. You will have to do it by hand. This will be implemented as soon as multiple backends are fully supported because that implies a different approach since data has to be moved between different servers then. An administrative command to move partitions will also be available then (HSCALE MOVE_PARTITION(...)).

Multiple instances of HSCALE working on the same data

HSCALE is designed to support multiple instances of it running in parallel mostly to avoid to be the bottleneck and single point of failure. Every instance of HSCALE periodically refreshes the internal partition configuration to reflect changes made by another instance. In the case of creating and moving partitions, partitions will be locked inside HSCALE so all clients will wait until the operation finishes. This guarantees data integrity.

Version 0.3 will be released within the next few weeks but definitely after MySQL Proxy 0.7 has been released so we can thoroughly test it against the newest version.

Please feel free to discuss certain features and design decisions shown above. Any feedback is welcome!

Log4LUA version 0.2 and project page

Saturday, September 6th, 2008

Everything is available at the project page

Please report issues and feature request here.

Version 0.2 is a bug fix release but with 2 significant changes in syntax:

  1. Changed all constructor methods from create(...) to new. Seems to be more common in the LUA world.
  2. The logger class is now returned by the module. So it is
    local logger = require("optivo.common.log4lua.logger")
    local LOG = logger.new()

    instead of

    local logger = require("optivo.common.log4lua.logger")
    local LOG = logger.Logger.create()

Some potential bugs have been spotted using (a slightly adopted version of) lualint and LuaCov.

With release of version 0.2 the Log4LUA API is declared as stable. Future versions might change the syntax but are guaranteed to be backwards compatible.

More fun with LUA - Introducing Log4LUA

Tuesday, September 2nd, 2008

UPDATE Log4LUA 0.2 released. Go to the project page.

After the dust about backend connection handling in MySQL Proxy had settled, I begun working on other parts of HSCALE again. (I’ll return to the backend handling later after talking to Jan Kneschke about their plans.)

One of the issues that bothered me the most was logging. Until now I just had used a simple debug function that could be enabled or disabled using an environment variable. The pretty straightforward approach a lot of people use. But there is more to logging than to print debug messages. So I searched the web and of course I found LuaLogging. It is more flexible, has different levels (DEBUG, INFO, WARN, ERROR, FATAL) and appenders. You can print your log messages to the screen, write them to a file or send an email.

After working with other logging frameworks in other languages (mostly Log4X like Log4J) I got used to a number of features missing in LuaLogging:

  • There are no means of external configuration. The logger is configured inside the code. You cannot use different configurations for development and production out of the box. This is crucial in my opinion, since during development it is handy to enable all log levels and print everything on screen, while in production you want to log to a file and disable at least the DEBUG level.
  • You get no information where the log event came from. Log messages are more meaningful if you know where the log message has been created, i.e. in which source file and line position. This makes it easier to track down bugs and errors.
  • There is no log category concept. Having different log categories makes it easier to group log messages either by source or by meaning. As an example you could have one log category called “core” where all core messages go to and another “connection” where all connection related log messages go to.
  • You cannot use multiple appenders. In production you might want to log all messages to a file and get an email for all errors.

Since adding all this functionality to LuaLogging would in fact mean to re-write it, I wrote yet another logging facility and called it: Log4LUA. I choose the name because this package behaves almost like Log4X.

Download Log4LUA

A detailed documentation on how to use it can be found here (luadoc).

Features

  • External configuration. Configure your logging system via a config file.
  • Logger categories Configure different categories for different logging tasks.
  • Detailed information available: source file, line, function or the whole stack trace
  • Console, file and smtp (email) appenders
  • Multiple appenders per category Log to a file and get the worst errors by email
  • Level threshold for smtp appender. Don’t get every log message by mail only the ones above a defined level.
  • Log file rotations for file appender. Based on a date pattern

Basic usage

First write a configuration file (say log4lua.conf.lua):

local logger = require("optivo.common.log4lua.logger")
local console = require("optivo.common.log4lua.appenders.console")
local file = require("optivo.common.log4lua.appenders.file")
local smtp = require("optivo.common.log4lua.appenders.smtp")

local config = {}

– Create a default smtp appender sending message of level WARN or higher.
local defaultSmtp = smtp.new(
    {
        headers = {
            from = "myserver@mydomain.com",
            to = "admin@mydomain.com",
            subject = "%LEVEL: %MESSAGE"
        },
        body = "Hi, an error occurred at %FILE:%LINE.\n\nLevel: %LEVEL\nMessage: %MESSAGE\n"
    },
    "mail.mydomain.com",
   logger.WARN
)

– ROOT category must be configured.
– For the root category we use console and smtp appender
config["ROOT"] = logger.Logger.new({console.new(), defaultSmtp}, "ROOT", logger.INFO)

– Category "core" uses rotating file appender and the default message pattern
config["core"] = logger.Logger.new(file.new("core-%s.log", "%Y-%m-%d"), "core", logger.INFO)

– The config table must be returned.
return config

In your lua source (say mycode.lua) use the following to get a logger:

local logger = require("optivo.common.log4lua.logger")

– I would prefer calling the main logger LOG.
– This will return a logger for category "ROOT" since there is no category "mycode".
local LOG = logger.getLogger("mycode")

– This will return a logger of the category "core" since the category name "core.moduleA" starts with "core"
local LOG_CORE = logger.getLogger("core.moduleA")

– Log something
LOG:info("My first log message")

– Tables are converted to string for you.
LOG:warn({name = "Paul", age = "22"})

– Log to different category.
LOG_CORE:fatal("System error")

Start your program with:

LOG4LUA_CONFIG_FILE = "log4lua.conf.lua" lua mycode.lua

For more examples look at the HSCALE sources. Log4LUA is used everywhere.

Please feel free to comment on this. Or send usage or bug reports. I will setup a project page later on.