Wednesday, June 30, 2010

The six thousand topic man: hosting many topics in the same ActiveMQ

An ActiveMQ user was asking whether ActiveMQ could handle a publish-subscribe messaging architecture with six thousand topics. I've often seen production deployments of FUSE support tens to hundreds of JMS destinations; however, I wasn't quite sure how it would perform with a huge number of topics. Of course, you could reduce your number of topics by introducing message selectors on a smaller number of topics, but that avoids the question rather than answering it up front.

Throwing some questions at the FUSE engineering team got back a lot of confidence that it would indeed work just fine. Still, I always like to try things out and see for myself. So, I slapped together a JMS client that wrote 1,000,000 non-persistent messages to 6,000 JMS topics. Then, I put together another JMS client with 6,000 consumers, with appropriate session and connection pooling in place. The result? Alarmingly straightforward: it worked just fine! While I'm quietly content with this outcome, it's worth mentioning some background things I did on the broker...

First, I needed to switch off the default 'thread-per-consumer' model in ActiveMQ. This is done by setting the JVM system property -Dorg.apache.activemq.UseDedicatedTaskRunner=false; by default, this is set to 'true' in the ./bin/activemq[.bat] startup script. I tested what happens if I leave this set to 'true', and indeed I ended up with 6,000 threads in the broker. Ouch. With the setting disabled, the consumers are served from a pool of threads, and in my test the total thread count never got above sixty.
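To make the threading difference concrete, here is a minimal sketch in plain Java that uses only java.util.concurrent. It is not ActiveMQ's internal task runner; the class and method names (PooledDispatchSketch, distinctThreadsUsed) are made up for illustration. It runs one small piece of "dispatch" work per simulated consumer on a fixed-size pool and counts how many distinct threads did the work, which is the property that keeps the thread count bounded when dedicated task runners are switched off.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustration only: a hypothetical stand-in for a pooled task runner.
class PooledDispatchSketch {

    // Runs one dispatch task per simulated consumer on a shared pool and
    // returns the number of distinct threads that executed those tasks.
    static int distinctThreadsUsed(int consumers, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<Thread> threads = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(consumers);
        try {
            for (int i = 0; i < consumers; i++) {
                pool.execute(() -> {
                    // Stand-in for "deliver a message to this consumer".
                    threads.add(Thread.currentThread());
                    done.countDown();
                });
            }
            if (!done.await(60, TimeUnit.SECONDS)) {
                throw new IllegalStateException("dispatch tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } finally {
            pool.shutdown();
        }
        return threads.size();
    }

    public static void main(String[] args) {
        // 6,000 simulated consumers served by at most 60 threads.
        System.out.println("distinct threads used: " + distinctThreadsUsed(6000, 60));
    }
}
```

With a thread per consumer, the same workload would need one thread for each of the 6,000 consumers; with the shared pool, the count can never exceed the pool size, whatever the number of consumers.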

Next, I configured the Broker's transport connector to use 'nio:' rather than 'tcp:': this means we get a cleaner, more scalable threading model within the broker.

And so, it all works just fine. Dejan Bosanac's article on Python messaging: ActiveMQ and RabbitMQ suggests that you can get up to as many as 32,000 JMS destinations on a single broker; that's good to know, but I can't think of a situation where I'd need that right now.

Tuesday, June 29, 2010

ActiveMQ pooling: a pool by any other name would smell as sweet

I was working with a FUSE customer today who voiced confusion at all the different ways you can do a pooled connection factory in ActiveMQ. With confusion comes fear, uncertainty and distrust. Should I be using Jencks? Or should I be using the Spring CachingConnectionFactory? Wasn't there something in the activemq-pool documentation that said it didn't actually pool consumers? Help!

I took this confusion directly to the source, and had an enlightening discussion about it with James Strachan and Gary Tully at FuseSource. As a nice side effect, we ended up updating the documentation on Camel ActiveMQ, ActiveMQ Spring Support, and the Javadoc for org.apache.activemq.pool.PooledConnectionFactory.

The bottom line is this: while Jencks was recommended in the past, it's no longer necessary as you can just use the org.apache.activemq.pool.PooledConnectionFactory from the activemq-pool project. Alternatively, you can of course use the Spring CachingConnectionFactory, as outlined in this great article.

Here's the real sneaky thing though: the Javadoc for org.apache.activemq.pool.PooledConnectionFactory suggested that this connection factory doesn't pool consumers. This is in fact not a drawback or a failing: it simply doesn't make sense to 'pool' JMS consumers. Maintain a collection of them for concurrent consumption in parallel? Sure! But keep a 'pool' of consumers, whereby you return consumers to the pool for reuse later on when you're finished? Don't do it! It simply doesn't make much sense, and at a technical level it could end up creating havoc, as the 'idle' consumers would still get messages delivered to their internal 'prefetch' queues, where they'd dwell until the consumer is activated again. We updated the documentation to better explain what the PooledConnectionFactory does.
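The prefetch problem is easy to see in a toy model. The sketch below is plain Java and does not use the JMS or ActiveMQ APIs; ToyConsumer, dispatch and strandedMessages are hypothetical names, and the round-robin dispatch is a simplifying assumption about how a broker spreads queue messages across open consumers. The point it illustrates is only this: a consumer that is parked in a 'pool' is still open as far as the broker is concerned, so it keeps receiving messages that nobody reads.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustration only: a toy model of prefetch buffers, not ActiveMQ code.
class PrefetchSketch {

    // A consumer whose prefetch buffer is filled by the broker whether or
    // not application code is currently calling receive() on it.
    static final class ToyConsumer {
        final Deque<String> prefetch = new ArrayDeque<>();
        final boolean active; // false = parked in a "pool", nobody is reading

        ToyConsumer(boolean active) {
            this.active = active;
        }
    }

    // Assumed broker behaviour: round-robin over every open consumer.
    static void dispatch(List<ToyConsumer> consumers, int messageCount) {
        for (int i = 0; i < messageCount; i++) {
            consumers.get(i % consumers.size()).prefetch.add("msg-" + i);
        }
    }

    // Returns how many messages end up stuck in the buffers of idle consumers.
    static int strandedMessages(int activeCount, int idleCount, int messageCount) {
        List<ToyConsumer> consumers = new ArrayList<>();
        for (int i = 0; i < activeCount; i++) consumers.add(new ToyConsumer(true));
        for (int i = 0; i < idleCount; i++) consumers.add(new ToyConsumer(false));

        dispatch(consumers, messageCount);

        int stranded = 0;
        for (ToyConsumer c : consumers) {
            if (c.active) {
                c.prefetch.clear();           // active consumers drain their buffer
            } else {
                stranded += c.prefetch.size(); // idle ones just sit on theirs
            }
        }
        return stranded;
    }

    public static void main(String[] args) {
        System.out.println("stranded with one idle consumer: " + strandedMessages(1, 1, 10));
        System.out.println("stranded with no idle consumers: " + strandedMessages(2, 0, 10));
    }
}
```

In this model, one active and one 'pooled' consumer sharing ten messages leaves half of them stranded, while a plain collection of two active consumers leaves none.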

We realized that the confusion comes from a number of outdated resources on the web that mention a myriad of ways to do pooling. That, and the fact that it's easy to confuse 'pooling' with simply maintaining a collection of resources: the former involves sharing and reuse, while the latter does not.

Bottom line: forget about Jencks. Use activemq-pool's PooledConnectionFactory or Spring's CachingConnectionFactory to manage your connection, session and producer pools. And don't go talking about 'consumer pools', because it really doesn't make sense: talk about 'collections of consumers'.