Saturday, January 27, 2018

Star Trek Anti-Pattern

In previous installments, we tried to limit the number of Threads to the minimum, even to one. But what happens if you go the opposite way?
Often, we try to parallelize some process, but do not want to pay too much attention to the number of Threads. We can try to spawn a new Thread each time we need one:

Executor myExecutor = Executors.newCachedThreadPool();

Often, things will work correctly at first. But as soon as the amount of data increases, you’ll get the following error:

OutOfMemoryError: unable to create new native thread

You probably know this story: the support team has a problem with the tool, so they have a look at the logs. They see this message, so they call you to ask which parameter increases your application’s memory. You tell them about Xmx, but 5 minutes later, they call you again to ask how to run the application in 64-bit mode. That’s when you get suspicious, and ask for the logs.

In fact, this message is quite misleading for the unaware. You have to know that the JVM allocates Thread stacks in native memory, outside the heap that Xmx sizes, and that this is the memory that became scarce. Raising Xmx therefore does not help. One way to work around it is to decrease Xss, the size of a Thread stack in memory, but the best way is still to avoid going to infinity and beyond, that is, to avoid the Star Trek Anti-Pattern.
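To see how quickly a cached pool can grow, here is a minimal sketch. The class name `CachedPoolGrowth` and the method `threadsCreated` are made up for this illustration. It counts the Threads the pool asks for while every task is still busy, using a latch to simulate slow work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of any library.
public class CachedPoolGrowth {

    // Submits `tasks` blocking jobs to a cached pool and returns how many
    // Threads the pool created to run them.
    static int threadsCreated(int tasks) {
        AtomicInteger created = new AtomicInteger();
        // The ThreadFactory is only there to count Thread creations.
        ExecutorService pool = Executors.newCachedThreadPool(runnable -> {
            created.incrementAndGet();
            return new Thread(runnable);
        });
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // simulate a task that takes a long time
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return created.get();
        } finally {
            release.countDown(); // let every task finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(threadsCreated(50));
    }
}
```

Since no Thread is ever idle when a new task arrives, the pool has nothing to reuse and asks for one Thread per task. Keep the number small if you try it: with enough slow tasks, this is exactly how you reach the error above.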

So how many threads should I have? We’ll talk about it in the Medusa Pattern.

Sunday, January 7, 2018

Zebra Pattern

Last time I explained the One Ring Pattern, where only one thread handles the data coming from a queue. I also explained the reasons why such a pattern might be preferred.


One of those reasons is when ordering is important. Some events must keep the order in which they arrive in the queue, so handling them with one thread is a must. But what if there are many events, and I would really like to handle them on several threads? And what if the ordering is not compulsory between all events, but only between some of them? Using again the example from last time with market prices, order must be kept between prices for the same instrument. However, prices coming for different instruments can be handled in any order.

That’s when the Zebra Pattern comes in handy. Imagine that each stripe of the Zebra is a different thread, with its own queue. When a price arrives, you put it on the queue reserved for its instrument. In that way, prices for one instrument are handled in the order they arrived, while different instruments can be handled by different threads. To reduce the number of threads, you can use some modulo calculation, similar to the way a hash map distributes keys among its buckets.
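Here is a minimal sketch of the idea, assuming one single-thread executor per stripe and the instrument name as the key. The class `ZebraDispatcher`, its methods and the instrument names are invented for this example; this is not how the Striped Executor Service is implemented.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: each stripe is a single thread with its own queue.
public class ZebraDispatcher {
    private final ExecutorService[] stripes;

    public ZebraDispatcher(int stripeCount) {
        stripes = new ExecutorService[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Same key -> same stripe -> same thread, so order is kept per key.
    public void execute(Object key, Runnable task) {
        // floorMod keeps the index positive even for negative hash codes.
        int index = Math.floorMod(key.hashCode(), stripes.length);
        stripes[index].execute(task);
    }

    public void shutdownAndWait() {
        for (ExecutorService stripe : stripes) {
            stripe.shutdown();
        }
        try {
            for (ExecutorService stripe : stripes) {
                stripe.awaitTermination(5, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Sends 100 prices for each of three made-up instruments through 4 stripes
    // and records the order in which each instrument's prices were handled.
    static Map<String, List<Integer>> demo() {
        Map<String, List<Integer>> handled = new ConcurrentHashMap<>();
        ZebraDispatcher dispatcher = new ZebraDispatcher(4);
        String[] instruments = {"EURUSD", "GBPUSD", "USDJPY"};
        for (int i = 0; i < 100; i++) {
            final int price = i;
            for (String instrument : instruments) {
                dispatcher.execute(instrument, () ->
                        handled.computeIfAbsent(instrument, k -> new CopyOnWriteArrayList<>())
                               .add(price));
            }
        }
        dispatcher.shutdownAndWait();
        return handled;
    }

    public static void main(String[] args) {
        demo().forEach((instrument, prices) ->
                System.out.println(instrument + " handled " + prices.size() + " prices in order"));
    }
}
```

Two instruments may land on the same stripe, exactly like two keys sharing a bucket. That costs some parallelism but never breaks the ordering within one instrument.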

If this Pattern interests you, have a look at Heinz Kabutz’s Striped Executor Service.

We have so far tried to tie execution of data to one thread. What if we go the opposite way? Wait for the Star Trek Anti-Pattern.

Monday, January 1, 2018

One Ring Pattern


Last time, I finished talking about all my Queue Patterns. Now I’ll start with my Thread Patterns. Once you have your queues filled with tasks, the question that arises is: how many Threads do I need to deal with them?
Often enough, the answer to this question is: only one. The unique Thread. One Thread to bring them all and in the darkness bind them.

Executor myExecutor = Executors.newSingleThreadExecutor();

But why would I want only one Thread when I could maybe go faster with several working in parallel? The first reason is that if I come from a place where there were no queues, and I just introduced one (see the Marsupilami Pattern), using only one Thread means fewer changes to my code. Everything will work more or less as before in this part of the code. There is less chance of introducing a regression.
Secondly, one thread means no concurrency, and therefore no synchronization problems. This also means simpler code, fewer bugs and less maintenance. Plus, if you’ve read Martin Thompson’s blog Mechanical Sympathy, you have probably heard that one big performance problem of having several Threads accessing the same queue is contention. So there are real cases where using one Thread brings better performance.
Another reason for using one Thread is that, even if you have many Threads processing the data at hyper speed, there might be only one Thread having to deal with the consequences at the end. For instance, if you are developing a GUI, there is only one Event Thread for drawing everything, and having several Threads dropping more and more data on it will not serve your purpose.
Last, an important reason for keeping only one Thread is when ordering is important. For instance, say you have an application that displays prices from the market, and you have some very volatile instrument. If its updates are handled by several Threads, a newer price might be handled faster than an older one, and you will end up with the older price displayed in the end.
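The ordering argument can be sketched as follows. The class `OneRingOrdering` and the method `handleInOrder` are names invented for this example, and the "prices" are just integers. All updates go through a single-thread executor and are appended to a plain, unsynchronized list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of any library.
public class OneRingOrdering {

    // Hands `updates` prices to one Thread and returns them in the order
    // in which that Thread handled them.
    static List<Integer> handleInOrder(int updates) {
        // Plain ArrayList: only the single worker Thread ever touches it.
        List<Integer> displayed = new ArrayList<>();
        ExecutorService oneRing = Executors.newSingleThreadExecutor();
        try {
            for (int i = 0; i < updates; i++) {
                final int price = i;
                oneRing.execute(() -> displayed.add(price));
            }
            // The snapshot is queued last, so it runs after every update,
            // on the same Thread.
            Callable<List<Integer>> snapshot = () -> new ArrayList<>(displayed);
            return oneRing.submit(snapshot).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            oneRing.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(handleInOrder(10));
    }
}
```

The tasks run one after the other in submission order, so the last price handled is always the newest one, and no lock is needed around the list.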
Even if you feel you are stuck with one Thread because of ordering, a solution still exists. I’ll describe it in the Zebra Pattern next time.