Code and Snow and Stuff

Musings on software and the quality of the white stuff…

Banning URLs from Varnish using Apache Camel and RabbitMQ – Part 2

Welcome Back!

I hope you found Part 1 of this tutorial useful. By now you should have a running instance of Varnish cache along with a running instance of RabbitMQ. You should also have cloned the Varnish-Ban project from Bitbucket and perhaps had a look through the project structure and source code. I hope there is nothing too unusual in there :-).

In today’s posting we will be covering the following topics:

  • The Varnish-Ban Camel Component
  • Configuring Varnish to respond to HTTP BAN requests.

I hope you enjoy the continuing adventure! :-)

The Varnish-Ban Component

Writing a component to hook into Apache Camel is really quite simple. There are various ways of doing it, but I chose a very explicit and straightforward way to achieve the goal of working with Camel. The main requirements were to:

  1. Create a POJO which implements the Component interface.
  2. Create the Service class that will handle the sending of the BAN request to Varnish.
  3. Add a file called varnish-ban into the folder META-INF/services/org/apache/camel/component. This will allow Camel to auto-register the component.
  4. Create a Camel XML file describing the route and the processing steps that Camel will carry out.

These steps are described below.

Creating the POJO

Writing the POJO was very simple. Below is a screenshot of the actual class:

A component in Camel is responsible for creating the Endpoints – in effect it is a Factory.  In my configuration, the Component calls the Endpoint which creates a Producer that invokes the VarnishBanServiceImpl class. I like decoupling of code, so it seemed sensible to me to externalise the actual work of the banning mechanism into a service class that does the work. The service class has the responsibility of sending the BAN request to Varnish. The varnishServerUrl is given to us by Camel when it processes the XML configuration file (see below). The main thing here is we don’t have to do any extra work to obtain the varnishServerUrl – it’s all externalised into the XML file.
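To make the chain of calls concrete, here is a minimal plain-Java sketch of the Component → Endpoint → Producer → service hand-off. The nested interfaces are simplified stand-ins invented for this illustration, not Camel's real Component, Endpoint and Producer types, and the names FactoryChainSketch and BanService are hypothetical. The real classes are in the Varnish-Ban project.

```java
import java.util.ArrayList;
import java.util.List;

class FactoryChainSketch {

    // Simplified stand-ins for the roles described above (NOT Camel's API).
    interface BanService { void ban(String url); }
    interface Producer { void process(String path); }
    interface Endpoint { Producer createProducer(); }
    interface Component { Endpoint createEndpoint(String varnishServerUrl); }

    // The component acts as a factory: it creates an endpoint bound to the
    // configured Varnish URL, and the endpoint creates a producer that
    // delegates the real work to the service.
    static Component varnishBanComponent(BanService service) {
        return varnishServerUrl -> {
            Endpoint endpoint = () -> {
                Producer producer = path -> service.ban(varnishServerUrl + path);
                return producer;
            };
            return endpoint;
        };
    }

    // Wires the chain together with a service that only records what it
    // was asked to ban, so the flow can be observed without a network.
    static List<String> demo() {
        List<String> banned = new ArrayList<>();
        Component component = varnishBanComponent(banned::add);
        Producer producer = component.createEndpoint("http://localhost:6081").createProducer();
        producer.process("/news/index.html");
        return banned;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the decoupling is visible in demo(): the service can be swapped for a recording stub without touching the component, endpoint or producer.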

Creating the Service Class

The VarnishBanService does all the real work. Fortunately for us, even this class is quite small and very straightforward in its functionality. It simply creates an instance of an HTTP client (from Apache HTTP Components) and sends off our customised HTTP request (a BAN request) to Varnish:

Our customised HttpRequest, the HttpBan class, is very simple:

All that is happening here is that we are extending a base HTTP class (provided by HTTP Client) and overriding the getMethod invocation to return our customised HTTP method – cleverly called BAN :-). The toString is a simple helper for when we are printing out debug/logging messages. You can create your own particular HTTP method (SUSHI anyone?) if you have different needs. We could have called our method “LOLBAN” if we wanted to :-)
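For readers who want to see a custom HTTP method in code, here is a minimal sketch. Note that it swaps in the JDK's built-in java.net.http client (Java 11 and later) rather than the Apache HTTP Components approach described above, so no subclass is needed. The class name BanRequestSketch and the example path are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class BanRequestSketch {

    // Builds a request whose HTTP method is the custom verb BAN.
    static HttpRequest banRequest(String varnishServerUrl, String path) {
        return HttpRequest.newBuilder()
                .uri(URI.create(varnishServerUrl + path))
                .method("BAN", HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // Sends the request and returns the status code Varnish answers with.
    // Requires a reachable Varnish instance, so main() does not call it.
    static int send(HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) {
        HttpRequest request = banRequest("http://localhost:6081", "/news/index.html");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Any valid method token other than CONNECT is accepted by the builder, so the same pattern would work for a differently named verb.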

The remainder of the VarnishBanService class just handles the response back from Varnish and prints out some debug/logging information. Please have a look through it to understand how it works. There shouldn’t be any surprises. I’m not handling any exceptions here, but what you could do is wrap up the exception into an AMQP message and shove it back into another queue for another system to process (a monitoring application for example).

Enabling Auto-Discovery of our Component by Camel

If one creates a file with the same name as our chosen URI scheme, varnish-ban (see the Camel XML route configuration section below to discover what this is all about), then Camel will automagically register our newly created component and make it ready for use. Like so:

The file has one line in it:

This is all that is required to enable auto discovery in Camel. Pretty neat.
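A Camel component descriptor of this kind is a small properties file with a single class key naming the component implementation. The sketch below shows what such a line looks like and how it can be read. The package com.example.varnish is a hypothetical placeholder (the real one is in the project source), and the parsing code only illustrates the idea; it is not Camel's own lookup code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class ComponentDescriptorSketch {

    // Hypothetical content of META-INF/services/org/apache/camel/component/varnish-ban.
    // The package name is an assumption made for this example.
    static final String DESCRIPTOR = "class=com.example.varnish.VarnishBanComponent\n";

    // Reads the descriptor as a properties file and returns the class name
    // stored under the "class" key.
    static String componentClassName(String descriptor) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(descriptor));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties.getProperty("class");
    }

    public static void main(String[] args) {
        System.out.println(componentClassName(DESCRIPTOR));
    }
}
```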

Creating the Camel XML Route Configuration

There are several ways to configure Routes in Camel – one is to use the Java DSL to wire things together – another way is to use an XML configuration. I chose the XML configuration just to keep things separate. Camel’s XML configuration builds on Spring, which the project already uses, so an XML configuration file seemed like a nice fit as well.

The file consists of the following elements:

  1. The Source Route. This is our connection to RabbitMQ using the Camel-Spring-AMQP component (see the file applicationContext-beans.xml in the source code).
  2. What to do when a message comes in (send it on to the varnish route)
  3. Splitting the XML payload from RabbitMQ using XPath to obtain the URLs that we wish to BAN
  4. Invoking our Varnish-Ban component against a running varnish instance (http://localhost:6081)
  5. Handling any exceptions that may occur. In this example nothing is done, but we could choose to invoke another Camel component to drop an error message into another queue (banQueueError?)
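The XPath splitting step can be tried outside Camel with the JDK's own XPath support. This is a minimal sketch under an assumed payload shape: a bans root element containing one url element per URL to ban. The actual message format is whatever the project's route expects, and the class name BanPayloadSplitSketch is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BanPayloadSplitSketch {

    // Extracts every URL from an XML payload of the assumed form
    // <bans><url>/path</url>...</bans>, one list entry per url element.
    static List<String> urlsToBan(String xml) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    "/bans/url/text()",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> urls = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                urls.add(nodes.item(i).getNodeValue().trim());
            }
            return urls;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not read ban payload", e);
        }
    }

    public static void main(String[] args) {
        String payload = "<bans><url>/a.html</url><url>/b/c.html</url></bans>";
        // Each extracted URL would be handed to the Varnish-Ban component in turn.
        urlsToBan(payload).forEach(System.out::println);
    }
}
```

In the Camel route the splitter does the equivalent work, producing one message per matched node so that each URL is banned by its own request.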

Configuring Varnish for HTTP BAN Requests

Varnish by default does not permit BANs to occur via HTTP requests. To help encourage Varnish to do so, we need to write a bit of VCL (Varnish Configuration Language). I’ve put the recipe (a complete VCL file) below (this example is also contained in the conf/varnish/default.vcl file in the Varnish-Ban project):

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

acl purge {
    "localhost";
}

sub vcl_fetch {
    set beresp.ttl = 5m;
}

sub vcl_recv {
    unset req.http.Cookie;
    if (req.request == "BAN") {
        if (client.ip !~ purge) {
            error 401 "Not allowed";
        }
        ban_url(req.url);
        error 200 "Banned " + req.url;
    }
}

Let’s walk through each section:

backend

This is the backend service that Varnish is fronting – in most cases this will be a webserver. Here I’m instructing Varnish to cache responses from a server running on my local machine and listening on port 8080 (Varnish by default listens on port 6081, so if I hit http://localhost:6081 what will actually be served up is content coming from http://localhost:8080).

acl purge

In this section I’m defining an ACL (Access Control List) of authorised machines that will be allowed to issue a BAN. The ACL is named purge (an invented name – I could have called it BANNERS if I wanted to) and is used in the vcl_recv section.

vcl_fetch

A FETCH is the response from the backend – in the sense that Varnish has “fetched” the response and potentially cached it. Here I’m telling Varnish to cache all backend responses for 5 minutes.

vcl_recv

A REC(ei)V(e) is the request coming into Varnish from a client. The important things to note here are:

  • I’m removing Cookies. By default Varnish does not cache any requests that contain Cookies.
  • We will do something special if the HTTP method of the request (taken from the request line) is “BAN”. I invented this method name – it could be called something else.
  • We will only allow those clients defined in our ACL the authority to BAN URLs from Varnish – otherwise we return a 401 (Not Authorised) to the client.
  • Finally we return a 200 to the client once we have finished processing the BAN request.

The example VCL should be put into your “default.vcl” and Varnish restarted. When this is done we are ready to move to the final part of this tutorial!

That’s all for now!

Hopefully by now you will have a running application. In the third and last article of this tutorial we will be sending BAN messages to Varnish and observing the results. Until then, have fun!

-=david=-

Written by dharrigan

January 5, 2012 at 9:23 am

Posted in development, java, software, varnish
