Remove duplications and fix bad names: February 2013

Wednesday, 27 February 2013

Easy testing SLAs on distributed components with grep4j

Say your distributed architecture looks something like the picture below, and the business has just handed you a requirement: messages sent by the Producer must reach the downstream systems (the Consumers) in no more than 400 milliseconds.

The requirement says:
The latency of a message sent from the Producer to any of the Consumers must never exceed 400 milliseconds.

Sounds familiar? To me it does, and experience has taught me that if I want to protect the SLAs in the future, I also need to automate the test, so that nobody introduces a bottleneck that increases message latency without it being noticed.

But how? The Producer and the Consumers run on separate machines, and some of the Consumers are not written in Java.
On top of that, there is a queue between the Producer and the Consumers (or a web service, RMI, an ESB, or some other component), which does not make testing any easier.

Well, all the components write logs in a similar way, so why not use the logs as test data?

For example, here are two sample log lines: one from the Producer sending a message (id 1546366) and one from a Consumer receiving the same message:

Producer logs

......

2013-02-19 10:09:05,795 INFO [org.grep4j.demo.input.core.InputCoreMessageSender] (pool-19-thread-9) Input::MessageSender::Message(1546366) Sent Successfully

......

Consumer logs

.....

2013-02-19 10:09:06,161 INFO [org.grep4j.demo.handler.bean.mdb.SingleDestPacketHandler] (Thread-62314 (HornetQ-client-global-threads-989457197)) Handler::Packet::Message(1546366) Received::PacketId(982336426)::State(NORMAL)::Dest(CONSUMER4, 1)::DataLevel(EVENT, 7)::Op(CREATE, C)::GlobalId(1546366)::Priority(1)::Src(GUI, 1::Ids(EventId=1546366,SFBId=1546366,isBirType=false,inBir=false))

.....

And this is what my automated performance test looks like using Grep4j:

package com.gdg.grep4j.demo;
import static com.gdg.grep4j.demo.profiles.Profiles.consumer1;
import static com.gdg.grep4j.demo.profiles.Profiles.consumer2;
import static com.gdg.grep4j.demo.profiles.Profiles.consumer3;
import static com.gdg.grep4j.demo.profiles.Profiles.producer;
import static com.gdg.grep4j.demo.services.TimeService.extractTime;
import static org.grep4j.core.Grep4j.constantExpression;
import static org.grep4j.core.Grep4j.grep;
import static org.grep4j.core.fluent.Dictionary.on;
import static org.hamcrest.CoreMatchers.is;
import static org.hamcrest.number.OrderingComparison.lessThan;
import static org.junit.Assert.assertThat;
import org.grep4j.core.result.GrepResults;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;
@Test
public class MessageDistributionPerformanceTest {

        // Maximum acceptable latency between Producer and Consumer, in milliseconds
        private static final long MAX_ACCEPTABLE_LATENCY = 400L;

        private long producerTime = 0;

        private GrepResults consumersResults;

        @BeforeTest
        public void triggerMessageDispatcher() {
                System.out.println("Producing and firing a Message(1546366) to downstream systems...");
        }

        // Runs after the message has been fired: grep the Producer log for the send time
        @BeforeTest(dependsOnMethods = "triggerMessageDispatcher")
        public void extractProducerTime() {
                GrepResults producerResult = grep(constantExpression("Message(1546366) Sent Successfully"), on(producer));
                producerTime = extractTime(producerResult.toString());
        }

        // Runs after the message has been fired: grep all the Consumer logs in one go
        @BeforeTest(dependsOnMethods = "triggerMessageDispatcher")
        public void grepConsumerLogs() {
                consumersResults = grep(constantExpression("Message(1546366) Received"),
                                on(consumer1, consumer2, consumer3));
        }

        public void testConsumer1Latency() {
                long consumer1Time = extractTime(consumersResults.filterOnProfile(consumer1).toString());
                assertThat((consumer1Time - producerTime),
                                is(lessThan(MAX_ACCEPTABLE_LATENCY)));
        }

        public void testConsumer2Latency() {
                long consumer2Time = extractTime(consumersResults.filterOnProfile(consumer2).toString());
                assertThat((consumer2Time - producerTime),
                                is(lessThan(MAX_ACCEPTABLE_LATENCY)));
        }

        public void testConsumer3Latency() {
                long consumer3Time = extractTime(consumersResults.filterOnProfile(consumer3).toString());
                assertThat((consumer3Time - producerTime),
                                is(lessThan(MAX_ACCEPTABLE_LATENCY)));
        }
}

A profile is the grep target context; in my case all the profiles are remote machines (for a better understanding of profiles, see the Grep4j page).

TimeService is just a simple service class that extracts the timestamp from the logs.

package com.gdg.grep4j.demo.services;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class TimeService {

        // Matches timestamps such as "2013-02-19 10:09:05,795"
        private static final Pattern timePattern = Pattern
                        .compile("[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}");

        // Returns the first timestamp found in the text as epoch milliseconds
        public static long extractTime(String text) {
                Matcher lm = timePattern.matcher(text);
                if (!lm.find()) {
                        throw new IllegalArgumentException("timePattern not found");
                }
                // SimpleDateFormat is not thread-safe, so a new instance is created per call
                SimpleDateFormat sdf = new SimpleDateFormat(
                                "yyyy-MM-dd HH:mm:ss,SSS");
                sdf.setTimeZone(TimeZone.getTimeZone("UTC"));

                String inputString = lm.group();
                try {
                        return sdf.parse(inputString).getTime();
                } catch (ParseException e) {
                        // Fail loudly instead of running into a NullPointerException on a null Date
                        throw new IllegalArgumentException("Unparseable timestamp: " + inputString, e);
                }
        }
}

In a few simple lines of code I have an extremely flexible test: I can test anything that ends up in the logs, as long as the clocks of the machines involved are synchronized.
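To make the arithmetic concrete, here is a minimal, self-contained sketch that applies the same idea to the two sample log lines shown earlier, without Grep4j. The class name LatencyFromLogs and its method names are made up for this illustration, and the log lines are abbreviated copies of the samples.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Grep4j: computes the latency between two log lines
public class LatencyFromLogs {

    // Same timestamp layout as in the sample logs: 2013-02-19 10:09:05,795
    private static final Pattern TIMESTAMP = Pattern.compile(
            "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}");

    // Extracts the first timestamp of a log line as epoch milliseconds
    public static long timestampMillis(String logLine) {
        Matcher m = TIMESTAMP.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("no timestamp in: " + logLine);
        }
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS");
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return format.parse(m.group()).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("bad timestamp: " + m.group(), e);
        }
    }

    // Latency = time the consumer logged the message - time the producer logged it
    public static long latencyMillis(String producerLine, String consumerLine) {
        return timestampMillis(consumerLine) - timestampMillis(producerLine);
    }

    public static void main(String[] args) {
        String producerLine = "2013-02-19 10:09:05,795 INFO Message(1546366) Sent Successfully";
        String consumerLine = "2013-02-19 10:09:06,161 INFO Message(1546366) Received";
        // prints "366 ms", which is within the 400 ms limit
        System.out.println(latencyMillis(producerLine, consumerLine) + " ms");
    }
}
```

For the sample message the difference is 366 milliseconds, so the assertion against the 400 millisecond limit would pass.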

For the full code : https://github.com/marcocast/grep4j-gdg.git

Monday, 25 February 2013

9 Must read books for developers by Uncle Bob Martin

Last week Uncle Bob Martin gave a talk at my company about Components and Architecture.
Needless to say, it was a great pleasure and a lot of fun to listen to him.
At the end of the talk, during the Q&A, someone asked him which books he considers a must for improving your development skills, so here are the 9 books a developer should read during their career:

Hope you will have some free time to read these fundamental books!

Note on the picture: I was lucky enough to capture him walking in front of a slide about my open-source library at some point during the talk :)

Thursday, 21 February 2013

Baobab to analyse remote disk space

I just discovered this nice tool for analysing disk space usage on a local or, more importantly, a remote machine.
Baobab is a graphical, menu-driven application for analysing disk usage in any GNOME environment.

This is what my Downloads folder looks like in Baobab

You can run Baobab on a remote machine through ssh with X11 forwarding (the -Y option), so that its window shows up on your local desktop. You can launch it with:

ssh user@url -Y baobab

So you can easily check what is going on when one of the remote machines you work with starts complaining about disk space!

Tuesday, 19 February 2013

ApacheBench - A quick n' easy performance test for your REST applications

ApacheBench is only useful for benchmarking POST or GET requests, but it is quick to set up and gives detailed results.

ab (ApacheBench) should be installed by default on Ubuntu; if it is missing, it is part of the apache2-utils package.

Example:

ab -n 100 -T 'application/xml' -A 'user:password' -p customer.xml -d http://172.18.48.67:8080/ws/1.1.1/resources/customers

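-n number of requests to perform

-c concurrency, i.e. the number of requests to perform at a time (not used above, so it defaults to 1)

-T Content-Type header for the POST data

-A credentials for HTTP basic authentication, as user:password

-p file containing the data to POST

-d do not display the "percentage served within XX [ms]" table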

The output:

This is ApacheBench, Version 2.3 <$Revision: 655654 $>

Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 172.18.48.67 (be patient).....done

Server Software: Apache-Coyote/1.1

Server Hostname: 172.18.48.67

Server Port: 8080

Document Path: /ws/1.1.1/resources/customers

Document Length: 1035 bytes

Concurrency Level: 1

Time taken for tests: 660.109 seconds

Complete requests: 100

Failed requests: 0

Write errors: 0

Total transferred: 132700 bytes

Total POSTed: 79000

HTML transferred: 103500 bytes

Requests per second: 0.15 [#/sec] (mean)

Time per request: 6601.087 [ms] (mean)

Time per request: 6601.087 [ms] (mean, across all concurrent requests)

Transfer rate: 0.20 [Kbytes/sec] received

0.12 kb/s sent

0.31 kb/s total

Connection Times (ms)

              min  mean[+/-sd] median   max
Connect:        0     0    1.2      0    11
Processing:  5363  6601  810.4   6590  8179
Waiting:     5362  6600  810.2   6590  8178
Total:       5363  6601  810.3   6590  8179
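The summary figures in this report can be re-derived from the raw totals, which is a handy sanity check when reading ab output. The sketch below recomputes three of them from the numbers above. The class and method names are made up for this illustration, and the formulas are my reading of what the report shows rather than a quote from the ab source.

```java
import java.util.Locale;

// Hypothetical helper: recomputes some ab summary figures from the raw totals
public class AbSummary {

    // Requests per second = completed requests / total time in seconds
    public static double requestsPerSecond(int completed, double seconds) {
        return completed / seconds;
    }

    // Mean time per request in ms = concurrency * total time / completed requests
    public static double timePerRequestMs(int concurrency, int completed, double seconds) {
        return concurrency * seconds * 1000.0 / completed;
    }

    // Transfer rate in Kbytes per second
    public static double kbytesPerSecond(long bytes, double seconds) {
        return bytes / 1024.0 / seconds;
    }

    public static void main(String[] args) {
        double seconds = 660.109; // "Time taken for tests"
        // prints 0.15, matching "Requests per second"
        System.out.println(String.format(Locale.ROOT, "%.2f", requestsPerSecond(100, seconds)));
        // prints 6601.09, matching "Time per request" up to rounding of the displayed time
        System.out.println(String.format(Locale.ROOT, "%.2f", timePerRequestMs(1, 100, seconds)));
        // prints 0.20, matching "Transfer rate ... received"
        System.out.println(String.format(Locale.ROOT, "%.2f", kbytesPerSecond(132700L, seconds)));
    }
}
```

With a concurrency level of 1, the two "Time per request" lines are identical, which is what the report above shows.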

Monday, 11 February 2013

Pool of ssh connections using Apache KeyedObjectPool

I have found org.apache.commons.pool extremely useful and robust, but not well documented.

So I'll try to help a bit here by explaining how to use an Apache KeyedObjectPool.

What is a KeyedObjectPool?

It's essentially a map of pools: it keeps a separate pool of instances for each key, and each of these pools is accessed through an arbitrary key.
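To illustrate the concept only, here is a deliberately tiny keyed pool written in plain Java. It is not how commons-pool is implemented, and TinyKeyedPool with its methods is a made-up name for this sketch. It just shows the "one stack of idle instances per key" idea.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Conceptual sketch, NOT the Apache implementation: one stack of idle instances per key
public class TinyKeyedPool<K, V> {

    private final Map<K, Deque<V>> idle = new HashMap<>();
    private final Function<K, V> factory;

    public TinyKeyedPool(Function<K, V> factory) {
        this.factory = factory;
    }

    // Hands out an idle instance for the key, or creates a new one if none is idle
    public synchronized V borrow(K key) {
        Deque<V> stack = idle.get(key);
        if (stack == null || stack.isEmpty()) {
            return factory.apply(key);
        }
        return stack.pop();
    }

    // Puts the instance back so the next borrower of the same key can reuse it
    public synchronized void giveBack(K key, V instance) {
        idle.computeIfAbsent(key, k -> new ArrayDeque<>()).push(instance);
    }

    // Number of idle instances currently held for the key
    public synchronized int idleCount(K key) {
        Deque<V> stack = idle.get(key);
        return stack == null ? 0 : stack.size();
    }

    public static void main(String[] args) {
        TinyKeyedPool<String, StringBuilder> pool =
                new TinyKeyedPool<>(host -> new StringBuilder("connection to " + host));
        StringBuilder first = pool.borrow("server-a");
        pool.giveBack("server-a", first);
        StringBuilder second = pool.borrow("server-a");
        System.out.println(first == second);            // true: the same instance was reused
        System.out.println(pool.idleCount("server-b")); // 0: each key has its own sub-pool
    }
}
```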

In this example I'll create a pool of JSch ssh connections, and I will use a simple getter/setter object called ServerDetails as the key.

Basically, for each server I want a pool of up to 10 reusable ssh connections.

So the first thing to do is to create a SessionFactory, a class in charge of creating the actual objects you want to store in the pool. In our example, that is an ssh connection.

SessionFactory needs to extend BaseKeyedPoolableObjectFactory<K,V>, where K is the type of the keys in the pool and V is the type of the objects held in it.

All you need to do is implement the makeObject method, which creates the object that goes into the pool, and override destroyObject, which holds the clean-up code that runs when an instance is dropped from the pool for good (not when it is simply returned for reuse).

package org.grep4j.core.command.linux;
import org.apache.commons.pool.BaseKeyedPoolableObjectFactory;
import org.grep4j.core.model.ServerDetails;
import com.jcraft.jsch.JSch;
import com.jcraft.jsch.Session;
import com.jcraft.jsch.UserInfo;
/**
 * This class is used to handle ssh Session inside the pool.
 * 
 * @author Marco Castigliego
 *
 */
public class SessionFactory extends BaseKeyedPoolableObjectFactory<ServerDetails, Session> {

        /**
         * This creates a Session if not already present in the pool.
         */
        @Override
        public Session makeObject(ServerDetails serverDetails) throws Exception {
                Session session = null;
                try {
                        JSch jsch = new JSch();
                        session = jsch.getSession(serverDetails.getUser(), serverDetails.getHost(), serverDetails.getPort());
                        session.setConfig("StrictHostKeyChecking", "no"); // skips host key verification: fine for a demo, risky in production
                        UserInfo userInfo = new JschUserInfo(serverDetails.getUser(), serverDetails.getPassword());
                        session.setUserInfo(userInfo);
                        session.setTimeout(60000);
                        session.setPassword(serverDetails.getPassword());
                        session.connect();
                } catch (Exception e) {
                        throw new RuntimeException(
                                        "ERROR: Unrecoverable error when trying to connect to serverDetails :  " + serverDetails, e);
                }
                return session;
        }

        /**
         * This is called when closing the pool object
         */
        @Override
        public void destroyObject(ServerDetails serverDetails, Session session) {
                session.disconnect();
        }
}

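The second thing you need to do is to create the actual keyed pool object. In our example we create a singleton that holds a StackKeyedObjectPool.
The number 10 is a cap on the number of "sleeping" (idle) instances kept in the pool for each key.
Note that a StackKeyedObjectPool does not limit the number of active instances: if 11 clients ask for an ssh connection to the same server at the same time, the 11th does not wait but gets a freshly created connection, and once all of them have been returned only 10 are kept for reuse. If you need borrowers to block once a limit is reached, have a look at GenericKeyedObjectPool instead.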
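The small simulation below shows what a cap on "sleeping" instances means in practice, as I understand it from the commons-pool 1.x Javadoc for StackKeyedObjectPool: borrowing never blocks, and the cap only limits how many returned instances are kept. It is plain Java for a single key, not the Apache code, and SleepingCapDemo is a made-up name. Which instance gets destroyed when the cap is hit is simplified here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Simulation of a "max sleeping" cap for one key; NOT the Apache implementation
public class SleepingCapDemo {

    private final Deque<Object> idle = new ArrayDeque<>();
    private final int maxSleeping;
    private int created = 0;
    private int destroyed = 0;

    public SleepingCapDemo(int maxSleeping) {
        this.maxSleeping = maxSleeping;
    }

    // Never blocks: an empty stack simply means a new instance is created
    public Object borrow() {
        if (idle.isEmpty()) {
            created++;
            return new Object();
        }
        return idle.pop();
    }

    // Keeps at most maxSleeping idle instances; any surplus is destroyed
    public void giveBack(Object instance) {
        if (idle.size() >= maxSleeping) {
            destroyed++; // simplification: here the returned instance is the one dropped
            return;
        }
        idle.push(instance);
    }

    public int created() { return created; }
    public int destroyed() { return destroyed; }
    public int idleCount() { return idle.size(); }

    public static void main(String[] args) {
        SleepingCapDemo pool = new SleepingCapDemo(10);
        List<Object> inUse = new ArrayList<>();
        for (int i = 0; i < 11; i++) {
            inUse.add(pool.borrow()); // the 11th borrow succeeds immediately
        }
        for (Object o : inUse) {
            pool.giveBack(o);
        }
        // prints "created=11 idle=10 destroyed=1"
        System.out.println("created=" + pool.created()
                + " idle=" + pool.idleCount() + " destroyed=" + pool.destroyed());
    }
}
```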

package org.grep4j.core.command.linux;
import org.apache.commons.pool.KeyedObjectPool;
import org.apache.commons.pool.impl.StackKeyedObjectPool;
import org.grep4j.core.model.ServerDetails;
import com.jcraft.jsch.Session;
/**
 * Pool controller. This class exposes the org.apache.commons.pool.KeyedObjectPool class.
 * 
 * @author Marco Castigliego
 *
 */
public class StackSessionPool {

        private KeyedObjectPool<ServerDetails, Session> pool;

        private static class SingletonHolder {
                public static final StackSessionPool INSTANCE = new StackSessionPool();
        }

        public static StackSessionPool getInstance() {
                return SingletonHolder.INSTANCE;
        }

        private StackSessionPool()
        {
                startPool();
        }

        /**
         * 
         * @return the org.apache.commons.pool.KeyedObjectPool class
         */
        public KeyedObjectPool<ServerDetails, Session> getPool() {
                return pool;
        }

        /**
         * Creates the pool, keeping up to 10 idle sessions per key.
         */
        public void startPool() {
                pool = new StackKeyedObjectPool<ServerDetails, Session>(new SessionFactory(), 10);
        }
}

Using it is simple and straightforward.
To obtain an ssh connection from the pool, we just need to call:

StackSessionPool.getInstance().getPool().borrowObject(serverDetails)

where serverDetails is our key (we want a pool of ssh connections per server).

When the connection is no longer needed, we put it back in the pool with:

StackSessionPool.getInstance().getPool().returnObject(serverDetails, session);

Here is a command executor that borrows a session and returns it in a finally block:

package org.grep4j.core.command.linux;

import org.grep4j.core.command.ExecutableCommand;
import org.grep4j.core.model.ServerDetails;
import com.jcraft.jsch.Channel;
import com.jcraft.jsch.ChannelExec;
import com.jcraft.jsch.Session;
/**
 * The JschCommandExecutor uses the JSch library to execute remote
 * commands.
 * 
 * <ol>
 * <li>Borrows a session from the pool, using the {@link ServerDetails} as key</li>
 * <li>Opens a channel on the session</li>
 * <li>Executes a command on the channel</li>
 * <li>Disconnects the channel</li>
 * <li>Returns the session to the pool</li>
 * </ol>
 * 
 * @author Marco Castigliego
 * 
 */
public class JschCommandExecutor extends CommandExecutor {

        public JschCommandExecutor(ServerDetails serverDetails) {
                super(serverDetails);
        }

        @Override
        public CommandExecutor execute(ExecutableCommand command) {
                Session session = null;
                Channel channel = null;
                try {

                        session = StackSessionPool.getInstance().getPool()
                                        .borrowObject(serverDetails);
                        //...do stuff
                } catch (Exception e) {
                        throw new RuntimeException(
                                        "ERROR: Unrecoverable error when performing remote command "
                                                        + e.getMessage(), e);
                } finally {
                        if (null != channel && channel.isConnected()) {
                                channel.disconnect();
                        }
                        if (null != session) {
                                try {
                                        StackSessionPool.getInstance().getPool()
                                                        .returnObject(serverDetails, session);
                                } catch (Exception e) {
                                        e.printStackTrace();
                                }
                        }
                }

                return this;
        }
}

Remember to close the pool when you don't need it anymore with:
StackSessionPool.getInstance().getPool().close();

I hope this clarifies a bit this extremely helpful and robust Apache API.