pero on anything

java

Automated performance degradation tests with JUnit4

There are several extensions to JUnit, such as JUnitPerf or p-unit, that provide means to test performance. But it is hard to formulate the right assertions. What if the test runs on a beefier machine or in another environment? I just want to answer a simple question: Did performance degrade? (And if so, when?) [...]
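One way to sidestep machine-dependent thresholds is to compare a measurement against a baseline recorded earlier in the same environment, rather than against a fixed number. The following is a minimal sketch of that idea, not necessarily the approach the post itself takes (the excerpt is cut off before describing it). The class name DegradationCheck, its methods, and the 10% tolerance are all hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of JUnit, JUnitPerf, or p-unit.
final class DegradationCheck {

    // Runs the task `runs` times and returns the median wall-clock time in
    // nanoseconds (the upper median when `runs` is even).
    static long medianNanos(Runnable task, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    // True if the current time exceeds the baseline by more than the given
    // relative tolerance (0.10 means "more than 10% slower").
    static boolean degraded(long baselineNanos, long currentNanos, double tolerance) {
        return currentNanos > baselineNanos * (1.0 + tolerance);
    }
}
```

Inside a JUnit 4 test, the baseline would be loaded from wherever earlier runs on the same machine stored it, and the test could then call assertFalse(DegradationCheck.degraded(baseline, current, 0.10)).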

Integrating MySQL and Hadoop – or – A different approach on using CSV files in MySQL

We use both MySQL and Hadoop a lot. If you utilize each system to its strengths, this is a powerful combination. One problem we constantly face is making data extracted from our Hadoop cluster available in MySQL. The problem: Look at this simple example. Let’s say we have a table customer: CREATE [...]

MySQL Connector/J randomly hanging at com.mysql.jdbc.util.ReadAheadInputStream.fill

In the past months we struggled with large SELECT queries that just got stuck at:

    java.net.SocketInputStream.socketRead0(Native Method)
    java.net.SocketInputStream.read(SocketInputStream.java:129)
    com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:113)
    com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:160)
    com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:188)
    - locked com.mysql.jdbc.util.ReadAheadInputStream@cb9a81c
    com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2494)
    com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2949)
    com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2938)
    com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3481)
    com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959)
    com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2109)
    com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2642)
    - locked java.lang.Object@70cbccca
    com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2571)
    com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:782)
    - locked java.lang.Object@70cbccca
    com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:625)
    org.apache.commons.dbcp.DelegatingStatement.execute(DelegatingStatement.java:260)
    org.apache.commons.dbcp.DelegatingStatement.execute(DelegatingStatement.java:260)

Whenever this happened we just restarted the Tomcat server and everything was fine again [...]

Improve performance on small Hadoop clusters

Hadoop is designed to run on huge clusters containing several hundred machines. But some people just don’t need such a big cluster and can still use the benefits of HDFS and MapReduce on a smaller scale. We managed to improve the performance of our 10-node test cluster by almost 100% by adjusting the heartbeat intervals. Namenode and [...]

Simulating indexes in Hadoop

You should not try to use Hadoop as a “drop-in” replacement for your current (R)DBMS. That said, it is still possible to utilize the power of cluster computing while circumventing its weaknesses when it comes to ad-hoc or real-time queries. We use Hadoop as an on-line system tightly integrated with our application and use it [...]

Increasing Performance of Hadoop-Unit-Tests

Adding a lot of unit tests for our application, which uses Hadoop and its MapReduce engine, significantly increased integration build time. Hadoop comes with a LocalJobRunner, which is used by default, so you do not have to set up a complete cluster in order to run some unit tests. This is great! But the problem is: it [...]