« Back to blog

Using Gearman For Distributed Alerts

At BackType we manage over 30 virtual machines (EC2). We've leveraged the latest technology in cloud computing, storage, and data processing to index over one billion online reactions (comment-like data) and organize those conversations to help users find the latest news and opinions.

When you run dozens of machines, you're inevitably going to want some kind of monitoring in place. There are plenty of existing tools available such as monit, god, daemontools, etc for lower level systems management. Cloudkick provides free tools for managing virtual machines on EC2 and Slicehost with some basic monitoring. At BackType, we use a number of these.

However, as we rapidly deploy new technology and features, we've required more customizable monitoring. Suppose we wanted to keep track of the number of documents in our search index, or get notified if the size of one of our queues spikes over 1000 jobs. We found that we were always wanting a way to monitor very specific things for short periods of time.

Gearman

Gearman is a system to farm out work to other machines, dispatching function calls to machines that are better suited to do work, to do work in parallel, to load balance lots of function calls, or to call functions between languages.

We use Gearman to farm out millions of jobs across multiple machines every single day.

When we were putting monitoring in place, we wanted to avoid installing MTAs on every machine. So it made sense to, instead, leverage Gearman to put together a simple system to deliver alerts.

This is how we did it:

  1. Processes push notification jobs into the Gearman server queue
    import sys, cjson
    from gearman.client import GearmanClient
    from gearman.task import Task
    
    # Replace server IP(s)
    JOB_SERVERS = ['127.0.0.1']
    
    def queueNotification(subject, body):
        client = GearmanClient(JOB_SERVERS)
           client.do_task(Task(func='notify',
                   arg=cjson.encode((subject, body)),
                   background=True))
  2. One or more workers pull the jobs off the queue and deliver them by email:
    import sys, time, cjson
    from gearman.worker import GearmanWorker
    
    # Sendmail location
    SENDMAIL = '/var/qmail/bin/qmail-inject'
    
    # Replace server IP(s)
    JOB_SERVERS = ['127.0.0.1']
    
    def sendEmail(from, to, subject, body):
        p = os.open('%s -f%s' %(SENDMAIL, from), 'w')
        p.write('To: %s\n' %to)
        p.write('Subject: %s\n' %subject)
        p.write('From: %s<%s>\n' %(from, from))
        p.write('Return-Path: %s<%s>\n' %(from, from))
        p.write('Message-ID: <%s-%s>\n' %(time.time(), from))
        p.write('Content-Type: text/plain; charset=utf-8\n')
        p.write(body)
        sts = p.close()
        if sts != 0:
            print 'Sendmail exit status', sts
    
    def sendNotification(job):
        try:
            # decode and extract (subject, body) tuple
            subject, body = cjson.decode(job.arg)
            sendEmail('@'.join(['x', 'example.org']),
                        '@'.join(['y', 'example.org']),
                        subject, body)
        except:
           pass
    
    def main():
        worker = GearmanWorker(JOB_SERVERS)
        worker.register_function('notify', sendNotification)
        while True:
            ret = worker.work()
            if ret != SUCCESS:
                break
    
    if __name__ == '__main__':
        sys.exit(main())

The main benefit is how flexible the solution is and how easily we're able to add monitoring on an application-specific level as well as systems level.

by

| Viewed
times
Follow @BackTypeTech on Twitter!