Why Rubyists Should Care About Messaging (A High Level Intro)

By Jakub Stastny / June 14, 2011

Messaging in the context of application architecture (grandly referred to as message-oriented middleware on Wikipedia) is similar to messaging in the real world. If you want to ask a colleague to do something, you send them a message of some sort. If your app needs to ask another app to do something, it can do the same: send a message to another app or process asking it to run a command or send an e-mail, for example.

Note: This is a guest post by Jakub Stastny, a member of the RabbitMQ team. Further info can be found at the footer of this post.

There are many reasons for using messaging in your applications. It can help you:

improve response times by doing some tasks asynchronously
reduce complexity by decoupling and isolating applications
build smaller apps that are easier to develop, debug, test, and scale
build multiple apps that each use the most suitable language or framework versus one big monolithic app
get robustness and reliability through message queue persistence
potentially get zero-downtime redeploys
distribute tasks across machines based on load

For the purpose of this article I'm going to use the word "messaging" only for sending messages over some kind of messaging protocol such as AMQP or STOMP. There are some messaging systems which work only within one language, such as JMS for Java, and I'm not going to touch on these.

Messaging Architecture

Most messaging software is implemented as a message broker, a daemon that connects producers with consumers. Producers send messages and consumers process them. In Web development, a producer is usually a frontend that generates tasks based on user actions, whereas a consumer is usually a backend that executes those tasks. Examples of message brokers are RabbitMQ and ActiveMQ. However, a broker isn't strictly required. For example, ZeroMQ provides only a socket-like API (this white paper explains more about broker vs brokerless systems).

Basic schema of how messaging works.

A broker isn't just dumb storage for tasks; it can do a lot more. An important feature is advanced routing, which gives you the power to route one message to one or multiple queues based on configuration, or even based on some pattern in the messages. In the case of AMQP, the part of the broker that takes care of the routing is called an exchange.

Schema of how AMQP works.
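
To make the routing idea more concrete, here is a minimal sketch of what an exchange does, written as a toy in-memory model. The ToyExchange class and its methods are made up for illustration and are not the API of the amqp gem or of any broker; the queues are plain arrays. The topic wildcards follow the AMQP convention as I understand it ("*" for one word, "#" for zero or more).

```ruby
# Toy in-memory model of broker-side routing (hypothetical names, not a real client API).
class ToyExchange
  def initialize(type)
    @type     = type # :direct, :fanout or :topic
    @bindings = []   # pairs of [binding key, queue]
  end

  # Bind a queue (a plain Array here) to the exchange under a binding key.
  def bind(queue, key = "")
    @bindings << [key, queue]
    self
  end

  # Copy the message into every bound queue whose binding matches.
  def publish(message, routing_key = "")
    @bindings.each do |key, queue|
      queue << message if matches?(key, routing_key)
    end
  end

  private

  def matches?(binding_key, routing_key)
    case @type
    when :fanout then true
    when :direct then binding_key == routing_key
    when :topic  then topic_match?(binding_key.split("."), routing_key.split("."))
    else raise ArgumentError, "unknown exchange type: #{@type.inspect}"
    end
  end

  # "*" stands for exactly one dot-separated word, "#" for zero or more words.
  def topic_match?(pattern, words)
    return words.empty? if pattern.empty?

    head, *rest = pattern
    case head
    when "#" then (0..words.size).any? { |n| topic_match?(rest, words.drop(n)) }
    when "*" then !words.empty? && topic_match?(rest, words.drop(1))
    else          words.first == head && topic_match?(rest, words.drop(1))
    end
  end
end

thumbnails = []
audit_log  = []

exchange = ToyExchange.new(:topic)
exchange.bind(thumbnails, "images.*") # only image tasks
exchange.bind(audit_log, "#")         # a copy of everything

exchange.publish("resize cat.png", "images.resize")
exchange.publish("send welcome mail", "mail.welcome")

p thumbnails # only the image task
p audit_log  # both messages
```

The producer only names a routing key; which queues receive the message is decided entirely by the bindings, so new consumers can be added without touching the producer.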

How Can You Benefit from Using Messaging?

Reliability & Robustness

Now you might wonder: "Isn't a background thread enough?" But what if the application crashes? Most messaging brokers support some form of persistence, so even if the server is restarted, no data is lost¹. Messaging protocols often support 'acknowledgements' too, which means that a task is considered finished only once the client confirms that everything went OK.

¹ It can be tricky if the broker itself is killed, but there's usually a solution for that as well. For example, AMQP supports transactions, and RabbitMQ provides publisher confirms, a fast, asynchronous way to be notified that a message was published successfully. Persistent messages are confirmed when all queues have either delivered the message and received an acknowledgement, or persisted the message.
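
The acknowledgement idea can be sketched with a toy queue that keeps a message around until the consumer confirms it. This is an in-memory illustration only: the AckQueue class and its method names are hypothetical, and a real broker would also persist messages to disk.

```ruby
# Toy queue with acknowledgements (hypothetical names, in-memory only).
class AckQueue
  def initialize
    @ready    = [] # messages waiting for a consumer
    @unacked  = {} # delivery tag => message handed out but not yet confirmed
    @next_tag = 0
  end

  def publish(message)
    @ready << message
  end

  # Hand out the next message together with a delivery tag; nil when empty.
  def deliver
    return nil if @ready.empty?

    tag = (@next_tag += 1)
    @unacked[tag] = @ready.shift
    [tag, @unacked[tag]]
  end

  # The consumer confirms the work is done; only now is the message forgotten.
  def ack(tag)
    @unacked.delete(tag)
  end

  # Called when a consumer disappears: unconfirmed messages go back to the front.
  def requeue_unacked
    @ready.unshift(*@unacked.values)
    @unacked.clear
  end

  # Messages the queue is still responsible for.
  def size
    @ready.size + @unacked.size
  end
end

queue = AckQueue.new
queue.publish("send invoice #1")

tag, message = queue.deliver
# ... imagine the consumer crashes here, before calling queue.ack(tag) ...
queue.requeue_unacked

tag, message = queue.deliver # the same message is handed out again
queue.ack(tag)               # now the task is really finished
```

Compare this with a background thread: if the process dies halfway through, the work in the thread is simply gone, whereas here the unconfirmed message is still owned by the queue.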

Decoupling

With a message-based infrastructure, different parts of your app can easily communicate with each other, making it simple to decouple your app into a few smaller ones. I believe this is really crucial: it leads to a much better design, makes a lot of things simpler and gives you a natural progression for scaling.

If your apps are separated, you don't have to write everything in one language, so you can choose the right tool for the job. You can connect your new apps in Ruby with your legacy apps in, let's say, PHP. You don't have to rewrite the whole ecosystem of apps, and specific problems can be solved using Java, Erlang or C if you need better performance or scalability. This isolation can also make it easier for different people to work on different apps, as long as the messaging scheme is agreed upon. (Hello, outsourcing!)
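
What "agreeing on the messaging scheme" might look like in practice is a small, language-neutral message format. The sketch below uses JSON from Ruby's standard library; the field names ("task", "params") and the task name are assumptions made up for this example, not part of any standard.

```ruby
require "json"

# Producer side: serialize a task into a language-neutral payload.
# The "task"/"params" layout is a made-up convention for this example.
def encode_task(name, params)
  JSON.generate("task" => name, "params" => params)
end

# Consumer side: any language with a JSON parser can do the equivalent.
def decode_task(payload)
  message = JSON.parse(payload)
  [message.fetch("task"), message.fetch("params")]
end

payload = encode_task("resize_image", { "path" => "cat.png", "width" => 100 })
name, params = decode_task(payload)
puts "#{name}: #{params.inspect}"
```

Because the payload is just text, the consumer could equally well be written in PHP, Java or Erlang, which is the whole point of the decoupling.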

The pain of deploying large apps can also, in many cases, be reduced. Designed correctly, a heavily decoupled system made of several parts is less likely to come crashing down like a house of cards. Instead, a few component apps might die with only a cosmetic effect on the larger app overall.

And a bonus: because the apps are isolated, you can easily see their input and output, which makes them easy to inspect and debug.

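#!/usr/bin/env ruby
# encoding: utf-8

require "amqp"

# The amqp gem is asynchronous, so everything runs inside the EventMachine reactor.
EventMachine.run do
  # With no arguments, AMQP.connect uses the default connection settings.
  AMQP.connect do |connection|
    channel  = AMQP::Channel.new(connection)
    # An empty name asks the broker to generate the queue name.
    queue    = channel.queue("", auto_delete: true)
    # The exchange with an empty name is the default (direct) exchange.
    exchange = channel.direct("")

    # Publish to the default exchange, using the queue name as the routing key.
    exchange.publish "Hello RubyInside readers!", key: queue.name
  end
end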

Output from the com.rabbitmq.tools.Tracer tool of rabbitmq-java-client, showing how easily we can inspect the code above. Another tool you can use for this purpose is Wireshark.

Now imagine everyone starts to use your application. You've suddenly become rich, you buy beers for everyone and you hire the top people in your community. But then what happens? Oh no, an angry unicorn!

Right, it's time to scale. But how? The frontend is fairly simple, but there's a lot of stuff going on on the backend: sending e-mails, processing images, running some tasks ... you can add new instances, but it won't reduce complexity, and it won't be very efficient as different parts of the app have different performance requirements.

So instead you can use decoupling and split your app into multiple separate services. Such applications are very easy to scale, and because they're small, they're also much easier to test. The isolation also makes it easy to rewrite individual parts if the load gets so high that you need to rewrite the critical bits in Java or C.

Scaling by adding new instances.

Scaling by adding new task consumers.
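
Scaling by adding task consumers can be sketched inside a single Ruby process, with threads standing in for separate consumer processes and Ruby's built-in Queue standing in for the broker. The run_consumers helper is hypothetical and only illustrates the pattern; with a real broker the consumers would live on different machines.

```ruby
# Several consumers pull from one shared queue; whoever is free takes the next job.
# Threads and Queue are stand-ins for consumer processes and a broker queue.
def run_consumers(jobs, consumer_count)
  tasks = Queue.new
  done  = Queue.new

  jobs.each { |job| tasks << job }
  consumer_count.times { tasks << :stop } # one stop marker per consumer

  consumers = Array.new(consumer_count) do |id|
    Thread.new do
      while (job = tasks.pop) != :stop
        done << [id, job] # record which consumer handled which job
      end
    end
  end
  consumers.each(&:join)

  Array.new(done.size) { done.pop }
end

handled = run_consumers(%w[a.png b.png c.png d.png e.png f.png], 3)
handled.each { |id, job| puts "consumer #{id} processed #{job}" }
```

Which consumer picks up which job isn't deterministic, and that is fine: the producer never needs to know how many consumers exist, so adding capacity means starting one more of them.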

Faster Response Times

Most Web apps nowadays are too synchronous. If you upload an image, you might sit there waiting for the thumbnails to be made. It's slow and puts a lot of load on the frontend of a Web app. With a message-based architecture, the frontend could instead publish a message saying "Please resize image XY for me" (well, in a slightly more technical way ;-)) and leave it be. The same applies to many other situations: sending e-mails, following other users etc.

If a task can't be finished instantly and its result delivered to the user right away, put it into a message and pop it on a queue to be done later. Most larger sites and services have to do this, so if you can bake it into your smaller app, you'll give yourself a longer runway.
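
Here is a minimal sketch of that "publish and leave it be" flow. To keep it runnable without a broker, Ruby's built-in Queue plays the role of the message queue and a thread plays the backend; handle_upload and start_worker are hypothetical names, and the sleep merely stands in for the slow image processing.

```ruby
# Frontend side: enqueue the slow work and answer straight away.
def handle_upload(image_name, tasks)
  tasks << { task: "resize", image: image_name }
  "Upload of #{image_name} accepted"
end

# Backend side: a worker that processes tasks until it sees :stop.
def start_worker(tasks, results)
  Thread.new do
    while (job = tasks.pop) != :stop
      sleep 0.05 # stands in for the slow thumbnail generation
      results << "thumbnail for #{job[:image]}"
    end
  end
end

tasks   = Queue.new
results = Queue.new
worker  = start_worker(tasks, results)

puts handle_upload("cat.png", tasks) # returns without waiting for the resize

tasks << :stop
worker.join
puts results.pop
```

The response time of handle_upload no longer depends on how long the resize takes, only on how long it takes to put a message on the queue.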

Avoiding Unnecessary Downtimes

Another advantage of messaging is that, if designed properly, you could experience no downtime when redeploying backend services. Consider the scenario with uploading images I mentioned previously. If the communication were synchronous and the "scaler" were down, any request to the service would fail because it couldn't respond. With messaging, you don't have to care. You'd just publish the task and once the service is back online, it will "catch up" and process all of the images.

Communicating with a service over HTTP. If the service goes down, the frontend won't work.

Communicating with a service over a messaging broker. If the service goes down, the frontend can still work, because once the service comes back online, it'll catch up on the messages that were sent in the meantime.
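
The "catch up" behaviour can be sketched with a toy queue that stores messages while no consumer is attached and hands over the backlog as soon as one subscribes. BufferingQueue is a made-up, in-memory stand-in for a broker queue, not a real client API.

```ruby
# Toy broker queue: keeps messages while the consumer is away (hypothetical names).
class BufferingQueue
  def initialize
    @backlog  = []
    @consumer = nil
  end

  # The frontend can always publish, whether or not the service is running.
  def publish(message)
    if @consumer
      @consumer.call(message)
    else
      @backlog << message
    end
  end

  # The service (re)attaches and first works through everything it missed.
  def subscribe(&consumer)
    @consumer = consumer
    consumer.call(@backlog.shift) until @backlog.empty?
  end

  # The service goes away, for example during a redeploy.
  def unsubscribe
    @consumer = nil
  end

  def backlog_size
    @backlog.size
  end
end

queue  = BufferingQueue.new
scaled = []

queue.publish("resize a.png") # the "scaler" is down, yet publishing succeeds
queue.publish("resize b.png")

queue.subscribe { |message| scaled << message } # back online: catches up
queue.publish("resize c.png")                   # handled immediately from now on

p scaled
```

With a synchronous HTTP call, the first two requests would simply have failed while the service was being redeployed.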

But I Can Just Use HTTP, Right?

In the Ruby community HTTP is very popular, but it's often overused. Whether it fits depends on your use case. For example, if you need more advanced routing such as 1:n or n:m, HTTP has little to offer. If you need asynchronous processing or loosely coupled components, as in the pub/sub pattern, HTTP usually isn't a good choice either.

Downsides of Messaging

On the other hand, messaging infrastructures have their own downsides. First of all, there are the innate downsides of going distributed at all, such as increased reliance on the network and on systems administrators (networks can and do go down, even within a single machine). Then there's a fair amount of code needed to handle reconnection, such as redeclaring non-durable queues and exchanges, and you have to accumulate messages until the network connection is up again (though a good broker will deal with much of this).

One of the most important features of the upcoming AMQP 0.8 gem is significantly improved error handling, so these problems aren't fatal, but you should keep them in mind when deciding whether the architecture suits your application. In most cases it's a fair price for the advantages of a messaging-based architecture, but in some cases you might find it better to just use a synchronous approach.
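
To give an idea of the client-side bookkeeping involved, here is a rough sketch of a publisher that accumulates messages while the connection is down and, on reconnect, redeclares its queues before flushing the backlog. Everything here is hypothetical: FakeTransport stands in for a real connection, and neither class reflects how the amqp gem actually implements recovery.

```ruby
# Stand-in for a real connection; this interface is invented for the example.
class FakeTransport
  attr_reader :declared, :sent
  attr_accessor :connected

  def initialize
    @connected = false
    @declared  = []
    @sent      = []
  end

  def declare_queue(name)
    @declared << name
  end

  def send_message(queue, payload)
    raise IOError, "not connected" unless @connected

    @sent << [queue, payload]
  end
end

class ReconnectingPublisher
  def initialize(transport, queue_names)
    @transport   = transport
    @queue_names = queue_names
    @pending     = [] # messages accepted but not yet handed to the broker
  end

  def publish(queue, payload)
    @pending << [queue, payload]
    flush if @transport.connected
  end

  # Non-durable queues are gone after a connection loss, so declare them again first.
  def on_reconnect
    @queue_names.each { |name| @transport.declare_queue(name) }
    flush
  end

  def pending_count
    @pending.size
  end

  private

  def flush
    until @pending.empty?
      queue, payload = @pending.first
      @transport.send_message(queue, payload)
      @pending.shift # drop a message only after it has been sent
    end
  end
end

transport = FakeTransport.new
publisher = ReconnectingPublisher.new(transport, ["thumbnails"])

publisher.publish("thumbnails", "resize cat.png") # network is down: kept locally
transport.connected = true
publisher.on_reconnect                            # redeclare, then flush

p transport.declared
p transport.sent
```

Even this toy version needs a pending buffer, a reconnect hook and ordering rules, which is the kind of overhead to weigh against the benefits.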

Routing via HTTP, sending a request/response for each client/message.

Routing via a messaging broker: you send a message to the broker and it takes care of the rest.

My Presentation

I recently gave a talk giving an introduction to messaging, along similar lines to this article. The slides are embedded below for your reference:

WTF Is Messaging And Why You Should Use It?


I'd like to thank Michael Klishin for his suggestions about improving this post.

This post was by Jakub Stastny. Jakub is a Ruby contractor currently working for the RabbitMQ team of VMware with the mission to make Ruby developers more aware of messaging. He created the Rango framework, the first Ruby framework with template inheritance, and has contributed to many well-known projects such as RubyGems, RSpec and Merb. He has a blog, 101ideas.cz, where he writes about IT and self-development stuff, and he tweets as @botanicus.

Comments

Scott Parrish says:
June 15, 2011 at 2:21 am
lets say I want to start using messaging. And I'm looking at using solely in 2 contexts.
1. different processes on the same machine on an Ubuntu system(s)
2. different machines on a LAN.

What messaging brokers best serves each context?

And is DBus in Ubuntu a messaging solution in this same vein or is it something different?

thanks for the great article. I can see the usefulness of this approach. in things I'm working on now even.
Joe Van Dyk says:
June 15, 2011 at 2:50 am
In Ruby, why would you want to use an async approach (e.g. the amqp gem) instead of something synchronous (e.g. the carrot gem @ https://github.com/ruby-amqp/bunny)?

Carrot seems much simpler. I don't understand the advantages of async for a typical Ruby web application.
Nick says:
June 15, 2011 at 7:16 am
Great Article!
Totally unrelated, may I ask what you used to create those schemas? Have been looking for something to create pretty flowcharts with.
Michael says:
June 15, 2011 at 8:08 am
AMQP gem no longer lives at rubyforge.org. Please correct link to http://github.com/ruby-amqp/amqp, documentation is at http://bit.ly/amqp-gem-docs.
Michael says:
June 15, 2011 at 8:20 am
@JoeVanDyk: I won't judge what a "typical Ruby web application" looks like but many people use Bunny (what you referred to as "carrot") in Web apps to publish messages and standalone apps use amqp gem. That's why people over at github.com/ruby-amqp maintain both libraries and reach out to all those people who forked Bunny to make a few changes that were never merged back, especially lots of improvements by the Xing engineers.

If you prefer Bunny, use it. It is going strong and is not abandoned. But I encourage people to read through amqp gem documentation guides, at least Getting Started guide, before they declare amqp gem as "not easy to use". This is largely a problem of documentation quality and this is where most of the effort goes now for the amqp gem. Up to the point that I am rewriting EventMachine documentation.

I hope that answers your question.
Michael says:
June 15, 2011 at 8:31 am
Finally, Ruby amqp ecosystem updates (from releases to new features to documentation updates) are published at twitter.com/rubyamqp.
Jakub Stastny says:
June 15, 2011 at 11:42 am
Joe, there are reasons for using async particularly for messaging:

1) Messaging protocols such as AMQP are async themselves, so async code is a natural fit.
2) Generally async code is much faster.

Anyway, you'll be pleased to hear that I'm going to start working on Bunny soon. We don't plan to support Carrot, at least for the time being, but we want to port all the goodies of amq-client to Bunny, so it'll get a more granular API and it'll be much faster as well, because the amq-protocol parser we've written is way faster than the one the old AMQP gem and Bunny use (see Ruby AMQP benchmarks).
Joe Van Dyk says:
June 15, 2011 at 4:41 pm
Bah, I got my rabbit names mixed up. Yes, I meant bunny, not carrot.
Joe Van Dyk says:
June 15, 2011 at 6:42 pm
Yes, I meant Bunny, not Carrot, sorry. Got my rabbit names mixed up.
skrat says:
June 15, 2011 at 7:54 pm
Great post! I like the emphasis on cross platform and standardization. We run Rails frontend and Celery for async. job running. With just a bit of glue (https://github.com/skrat/celerb) it works like H 2 the O.
terrcin says:
June 16, 2011 at 12:48 am
you can install hadoop on a mac with brew:
postmodern says:
June 16, 2011 at 2:28 am
I wonder how ZeroMQ is doing in Ruby (zmq and ffi-rzmq). ZeroMQ is a simpler and more self-contained alternative to AMQP. Although, last time I played with it there were some Threading issues.
Milan Dobrota says:
June 16, 2011 at 6:53 pm
Awesome post! I use AMQP gem/RabbitMQ in the project I am currently working on and it works well so far. One of my pain points is getting the asynchronous code properly tested.
rick says:
June 17, 2011 at 2:56 pm
postmodern: ZeroMQ isn't supposed to work across threads. You can create ZeroMQ sockets to communicate between threads using the inter-process transport though.

The ZeroMQ Guide is a great read for anyone hungry for more about messaging systems after reading this.
Dewayne VanHoozer says:
June 20, 2011 at 7:43 pm
Messaging between apps has played an extremely important part in our work at the VRSIL (search youtube) within Lockheed. I have proposed a presentation for LSRC and RubyConf 2011 which focuses on our 5 years of integrating legacy apps and new apps using Ruby. We developed our Integrated System Environment around the concept of smart messages and plug-in message routers. It works great for rapidly integrating new systems...

btw RabbitMQ was the latest protocol/router we added to ISE in support of some commercial apps... and may I just say that it is an impressive bit of work.

If my proposal gets accepted the session title will be something like "The Secret Life of Ruby: Warrior With a Cause"

Dewayne
o-*
Tobin Harris says:
July 9, 2011 at 4:17 pm
Nice. Are the diagrams from http://yuml.me :) ?
GuyBoertje says:
July 15, 2011 at 5:53 am
At MusicGlue we are building our next gen platform that uses ZeroMQ for transient synchronous request/reply and RabbitMQ for persistent async pub/sub in jruby without EM.