A Manager for DRb

The other day I wanted to parallelize one of my applications, so I dusted off DRb (included in Ruby’s standard library). While DRb is pretty easy to use, the pain has always been setting up the servers and keeping their code current. This time I wanted the application to handle everything.

So here are my basic requirements:

  • Support using multiple DRb objects
  • Support local and remote hosting of the DRb server(s)
  • Application to install current code on host machine(s)
  • No application code to have to be manually installed on host machine(s)
  • Minimal software requirements on host machine(s)
  • Application to control the DRb server(s) (start/stop)
  • Linux and Mac support (don’t care about Windows)

The result is Drbman.

Installing Drbman

sudo gem install royw-drbman --source http://gems.github.com

Using Drbman

Let’s start by coding up the primes example (drbman/examples/primes).

I wanted something simple for an example, so I thought the Sieve of Eratosthenes would make a good one. It turns out it’s not really a good candidate for parallelizing via DRb, but it is still a good example.

After coding up the sieve and running it under the ruby-prof profiler, the most time-consuming chunk was the code that calculates the multiples of a prime, which took 42% of the time, so I decided to parallelize it. Moving it to its own class yields:

[sourcecode language='ruby']
class PrimeHelper
  # Find the multiples of the given prime number that are less than the
  # given maximum.
  # @example
  #  multiples_of(5,20) => [10, 15]
  # @param [Integer] prime the prime number to find the multiples of
  # @param [Integer] maximum the maximum integer
  # @return [Array] the array of the prime multiples
  def multiples_of(prime, maximum)
    a = []
    2.upto((maximum - 1) / prime) { |i| a << (i * prime) }
    a
  end
end
[/sourcecode]
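Before bringing DRb into it, it can help to see the helper driving a complete sieve in a single process. What follows is only a minimal sketch of that idea, not the exact code from the example project: `local_primes` is a name made up for illustration, and `multiples_of` is written with a range and `map` instead of the accumulator loop. Note the `+ 1` on the square root bound, which keeps a prime that equals the integer square root of the maximum.

```ruby
# Same behaviour as the PrimeHelper above, written with a range and map.
class PrimeHelper
  def multiples_of(prime, maximum)
    (2..((maximum - 1) / prime)).map { |i| i * prime }
  end
end

# Hypothetical single-process driver: returns all primes below maximum.
def local_primes(maximum, helper = PrimeHelper.new)
  return [] unless maximum > 2

  # Only primes up to sqrt(maximum) are needed to strike out composites.
  # local_primes(n) returns primes below n, hence the + 1.
  sqr_primes = local_primes(Math.sqrt(maximum).to_i + 1, helper)
  composites = sqr_primes.map { |prime| helper.multiples_of(prime, maximum) }

  # Mark every composite, then keep the unmarked indices from 2 upward.
  flags = Array.new(maximum, true)
  composites.flatten.each { |i| flags[i] = false }
  (2...maximum).select { |i| flags[i] }
end
```

Calling `local_primes(20)` walks the same steps the distributed version performs, only without threads or remote objects.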

After getting the PrimeHelper working in the original code, it's time to make it into a DRb server:

[sourcecode language='ruby']
require 'drbman_server'

# A helper object for calculating primes using the Sieve of Eratosthenes
#
# == Usage
# ruby prime_helper.rb foo.example.com 1234
# will run the service as: druby://foo.example.com:1234
#
# ruby prime_helper.rb foo.example.com
# will run the service as: druby://foo.example.com:9000
#
# ruby prime_helper.rb
# will run the service as: druby://localhost:9000
#
class PrimeHelper
  include DrbmanServer
  
  # Find the multiples of the given prime number that are less than the
  # given maximum.
  # @example
  #  multiples_of(5,20) => [10, 15]
  # @param [Integer] prime the prime number to find the multiples of
  # @param [Integer] maximum the maximum integer
  # @return [Array<Integer>] the array of the prime multiples
  def multiples_of(prime, maximum)
    a = []
    2.upto((maximum - 1) / prime) { |i| a << (i * prime) }
    a
  end
end

DrbmanServer.start_service(PrimeHelper)
[/sourcecode]

As you can see, there's not much to it. Add a require 'drbman_server', then an include DrbmanServer in your class, and finally start the service for your class with DrbmanServer.start_service(YourClassName). One thing to note here is that when we run the helper, it will not have a console connection (no stdin/stdout/stderr), so we need to include some form of logger that supports alternate output such as syslog, file, email, etc. (personally I'm partial to the log4r gem).
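To make both points concrete, here is a rough sketch of serving a helper with nothing but the standard library's drb and logger. It is an assumption about the kind of thing DrbmanServer.start_service wraps, not its actual implementation; `LoggingPrimeHelper` and `start_plain_service` are hypothetical names, and the stdlib Logger stands in for log4r.

```ruby
require 'drb'
require 'logger'

# A helper that writes to an injected logger instead of the console.
class LoggingPrimeHelper
  def initialize(logger)
    @logger = logger
  end

  def multiples_of(prime, maximum)
    @logger.info("multiples_of(#{prime}, #{maximum})")
    (2..((maximum - 1) / prime)).map { |i| i * prime }
  end
end

# Hypothetical stand-in for DrbmanServer.start_service: serve +front+
# at druby://host:port and return the URI the server is listening on.
def start_plain_service(front, host = 'localhost', port = 9000)
  server = DRb.start_service("druby://#{host}:#{port}", front)
  server.uri
end

# Typical use in a daemonized script (commented out so loading this
# file does not block):
#   log = Logger.new('prime_helper.log')
#   start_plain_service(LoggingPrimeHelper.new(log), 'localhost', 9000)
#   DRb.thread.join
```

A client would then reach the helper with DRbObject.new_with_uri and the returned URI.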

OK, now let's see how we hook into the client application.  Here's the sieve in our example:


[sourcecode language='ruby']
# The Sieve of Eratosthenes prime number finder
# Note, uses the Command design pattern
class SieveOfEratosthenes
  attr_reader :primes_elapse_time

  # Use the Sieve of Eratosthenes to find prime numbers
  #
  # @param [Integer] maximum find all primes lower than this maximum value - REQUIRED.
  # @option choices [Hash<String,String>] :dirs hash of local directories to copy to the host 
  #   machines where key is local source and value is directory on host machine - REQUIRED.
  # @option choices [String] :run the name of the file to run on the host machine - REQUIRED.
  #  This file should start the drb server.  Note, this file will be daemonized before running.
  # @option choices [Array<String>] :hosts (['localhost']) array of host machine descriptions "{user{:password}@}machine{:port}".
  # @option choices [Integer] :port (9000) default port number used to assign to hosts without a port number,
  #  the port number is incremented for each host.
  # @option choices [Array<String>] :gems array of gem names to verify are installed on the host machine.
  #  Note, 'daemons' is always added to this array.
  # @param [Logger] logger the logger to use
  def initialize(maximum, choices, logger)
    @maximum = maximum.to_i
    @choices = choices
    @logger = logger
    
    # we need at least one host that has a drb server running
    @choices[:hosts] = ['localhost'] if @choices[:hosts].blank?
    
    # specify the directories to copy to the host machine
    @choices[:dirs] = {File.join(File.dirname(__FILE__), '../drb_server') => 'drb_server'}

    # set the file to be run that contains the drb server
    @choices[:run] = 'drb_server/prime_helper.rb' if @choices[:run].blank?
    
    # specify gems required by the drb server object
    # each host will be checked to make sure these gems are installed
    @choices[:gems] = ['log4r']
    
  end
  
  # Calculate the primes
  # @return [Array<Integer>] the primes in an Array
  def execute
    result = []
    @logger.debug { @choices.pretty_inspect }

    Drbman.new(@logger, @choices) do |drbman|
      @primes_elapse_time = elapse do
        result = primes(@maximum, drbman)
      end
    end
    result
  end
  
  private
  
  # recursive prime calculation
  # @param maximum (see #initialize)
  # @param [Drbman] drbman the drb manager instance
  # @return [Array<Integer>] the array of primes
  def primes(maximum, drbman)
    indices = []
    if maximum > 2
      composites = calc_composites(maximum, drbman)
      flat_comps = composites.flatten.uniq
      indices = calc_indices(flat_comps, maximum)
    end
    indices
  end

  # find the composites array
  # @param maximum (see #initialize)
  # @param drbman (see #primes)
  # @return [Array<Integer>] the composites array
  def calc_composites(maximum, drbman)
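    # when n = 20
    # sqr_primes = [2,3]
    # composites = [[2*2, 2*3, 2*4,...,2*9], [3*2, 3*3, 3*4,...,3*6]]
    # primes(n) only returns primes below n, so add 1 to keep a prime that
    # equals the integer square root (e.g. 3 when n = 10)
    sqr_primes = primes(Math.sqrt(maximum).to_i + 1, drbman)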
    composites = []
    threads = []
    mutex = Mutex.new
    sqr_primes.each do |ip|
      # parallelize via threads
      # then use the drb object within the thread
      threads << Thread.new(ip, maximum) do |prime, max|
        drbman.get_object do |prime_helper|
          prime_multiples = prime_helper.multiples_of(prime, max)
          mutex.synchronize do
            composites << prime_multiples
          end
        end
      end
    end
    threads.each {|thrd| thrd.join}
    composites
  end
  
  # sift the indices to find the primes
  # @param [Array<Integer>] flat_comps the flattened composites array
  # @param maximum (see #initialize)
  def calc_indices(flat_comps, maximum)
    indices = []
    flags = Array.new(maximum, true)
    flat_comps.each {|i| flags[i] = false}
    flags.each_index {|i| indices << i if flags[i] }
    indices.shift(2)
    indices
  end
  
end
[/sourcecode]

Let's start with the @choices.  The example application uses UserChoices (http://user-choices.rubyforge.org) in cli.rb to handle command line, environment, and/or config file parameters.  A UserChoices object is close to a hash in that it has hash-like [] and []= methods.  If you don't use UserChoices, then just pass the parameters in a regular Hash.
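For instance, a plain Hash carrying the options documented on the initializer might look like the sketch below. The values are illustrative only, and `missing_required` is a hypothetical helper (not part of Drbman) that simply checks for the two options the documentation marks as REQUIRED.

```ruby
# Illustrative values; the keys are the ones documented on #initialize.
choices = {
  :hosts => ['localhost', 'localhost'],   # "{user{:password}@}machine{:port}"
  :port  => 9000,                         # default starting port
  :dirs  => { File.join(Dir.pwd, 'drb_server') => 'drb_server' },
  :run   => 'drb_server/prime_helper.rb', # file that starts the DRb server
  :gems  => ['log4r']                     # gems to verify on each host
}

# Hypothetical helper: list the REQUIRED options that are absent or empty.
def missing_required(choices)
  [:dirs, :run].select do |key|
    value = choices[key]
    value.nil? || value.empty?
  end
end

missing = missing_required(choices)
warn "missing choices: #{missing.inspect}" unless missing.empty?
```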

Back to the initializer: we first make sure at least one host is defined in @choices[:hosts].  Then we set @choices[:dirs] to a hash whose key is the local directory that contains the files we want copied to each host machine and whose value is the relative directory on the host machine we want the files copied into.  Next we point to our DRb server file with @choices[:run].  Finally we specify any gems we want to make sure are installed on each host with the @choices[:gems] choice.


    # we need at least one host that has a drb server running
    @choices[:hosts] = ['localhost'] if @choices[:hosts].blank?
    
    # specify the directories to copy to the host machine
    @choices[:dirs] = {File.join(File.dirname(__FILE__), '../drb_server') => 'drb_server'}

    # set the file to be run that contains the drb server
    @choices[:run] = 'drb_server/prime_helper.rb' if @choices[:run].blank?
    
    # specify gems required by the drb server object
    # each host will be checked to make sure these gems are installed
    @choices[:gems] = ['log4r']
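The :hosts entries follow the "{user{:password}@}machine{:port}" format from the initializer's documentation, with :port as the default that is incremented per host. As a rough illustration of how such descriptions could be taken apart, here is a sketch; `parse_host` and `assign_ports` are hypothetical names, this is not Drbman's parser, and it assumes the default port only advances when a host has no port of its own.

```ruby
# Optional "user[:password]@" prefix, machine name, optional ":port" suffix.
HOST_PATTERN = /\A(?:(?<user>[^:@]+)(?::(?<password>[^@]*))?@)?(?<machine>[^:@]+)(?::(?<port>\d+))?\z/

# Split one host description into its parts; missing parts are nil.
def parse_host(description)
  m = HOST_PATTERN.match(description)
  raise ArgumentError, "bad host description: #{description}" unless m

  { :user => m[:user], :password => m[:password],
    :machine => m[:machine], :port => m[:port] && m[:port].to_i }
end

# Give each host without an explicit port the next default port.
# Assumption: the counter advances only when the default is used.
def assign_ports(descriptions, default_port = 9000)
  next_port = default_port
  descriptions.map do |description|
    host = parse_host(description)
    unless host[:port]
      host[:port] = next_port
      next_port += 1
    end
    host
  end
end
```

With this sketch, listing the same machine twice yields two entries on consecutive ports, which is how one machine can host several servers.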

Next, take a look at the execute method. Here’s where we run Drbman. Note that we pass a Drbman instance into the primes method.

    Drbman.new(@logger, @choices) do |drbman|
      @primes_elapse_time = elapse do
        result = primes(@maximum, drbman)
      end
    end

Finally, skip down to the calc_composites method and look inside the threading block:

[sourcecode language='ruby']
    sqr_primes.each do |ip|
      # parallelize via threads
      # then use the drb object within the thread
      threads << Thread.new(ip, maximum) do |prime, max|
        drbman.get_object do |prime_helper|
          prime_multiples = prime_helper.multiples_of(prime, max)
          mutex.synchronize do
            composites << prime_multiples
          end
        end
      end
    end
    threads.each {|thrd| thrd.join}
[/sourcecode]

The drbman.get_object do |prime_helper| call yields a DRbObject that references one of the PrimeHelper instances on one of our host machines.  Yep, we don't know which one, nor do we really care.  The threading handles the parallel execution.

And that pretty much covers how to use Drbman.

Oh, why is this example of the Sieve of Eratosthenes not a good fit for DRb parallelization?  It turns out that the return value from PrimeHelper#multiples_of is an Array that can grow quite large.  This causes three problems: memory usage can get quite large, returning the marshaled array uses a lot of network bandwidth, and DRb appears to try to place the entire marshaled return object into a single packet.  While calculating primes below 10,000,000 finds 664,579 primes, trying with 20,000,000 generates the error "too large packet 51578437".
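A quick way to gauge whether a return value is "reasonable" is to marshal it yourself and look at the byte count, since DRb marshals return values before sending them. The sketch below does only that with a standalone copy of multiples_of; `marshaled_size` is a hypothetical helper name. As far as I know DRb compares message sizes against a configurable load limit, but treat that detail as an assumption and check the DRb documentation for your Ruby version.

```ruby
# Standalone copy of the helper method, for measuring only.
def multiples_of(prime, maximum)
  (2..((maximum - 1) / prime)).map { |i| i * prime }
end

# Number of bytes Ruby's Marshal needs for the given object.
def marshaled_size(object)
  Marshal.dump(object).bytesize
end

# The multiples of 2 are the largest array the sieve ever requests.
[1_000, 100_000, 1_000_000].each do |maximum|
  multiples = multiples_of(2, maximum)
  puts "maximum #{maximum}: #{multiples.size} integers, " \
       "#{marshaled_size(multiples)} bytes marshaled"
end
```

Running the same measurement for your own candidate method gives a feel for the per-call payload before you commit to moving it behind DRb.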

So in choosing what to parallelize, it's probably best to find parts with reasonably sized inputs and outputs.

Running the Primes Example

Using 3 machines:
  • royw-macbook 2GHz Intel Core 2 Duo, 2GB
  • royw-gentoo 2.5GHz Intel Q9300, 4GB
  • dad-kubuntu 2.4GHz Intel Q6600, 4GB


royw-macbook:primes royw$ bin/primes 10000000 -H 'royw-gentoo,royw-gentoo,royw-gentoo,royw-gentoo,dad-kubuntu,dad-kubuntu,dad-kubuntu,dad-kubuntu'
664579 primes found
calculation elapsed time: 49.745109
total elapsed time: 62.603881
royw-macbook:primes royw$ bin/primes 10000000 -H 'royw-gentoo,royw-gentoo,royw-gentoo,royw-gentoo'
664579 primes found
calculation elapsed time: 55.525954
total elapsed time: 58.73239
royw-macbook:primes royw$ bin/primes 10000000 -H 'royw-gentoo'
664579 primes found
calculation elapsed time: 107.114372
total elapsed time: 109.655318
royw-macbook:primes royw$ bin/primes 10000000 -H 'localhost'
664579 primes found
calculation elapsed time: 102.872163
total elapsed time: 107.949127
royw-macbook:primes royw$ bin/primes 10000000 -H 'localhost,localhost'
664579 primes found
calculation elapsed time: 68.445687
total elapsed time: 71.763174

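The last two runs should raise an eyebrow or two: two servers on the local dual-core MacBook (about 68 seconds) come close to four servers on the faster remote quad-core (about 56 seconds). The wire cost is pretty high.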
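To put numbers on that, the calculation times above can be turned into speedups relative to the single local server. This is just arithmetic on the reported figures; the labels and the `speedups` helper are made up for the illustration.

```ruby
# Calculation elapsed times (seconds) copied from the runs above.
CALC_TIMES = {
  'royw-gentoo x4 + dad-kubuntu x4' => 49.745109,
  'royw-gentoo x4'                  => 55.525954,
  'royw-gentoo x1'                  => 107.114372,
  'localhost x1'                    => 102.872163,
  'localhost x2'                    => 68.445687
}

# Speedup of every run relative to the chosen baseline run.
def speedups(times, baseline)
  base = times.fetch(baseline)
  times.map { |label, seconds| [label, (base / seconds).round(2)] }
end

speedups(CALC_TIMES, 'localhost x1').each do |label, factor|
  puts format('%-34s %.2fx', label, factor)
end
```

Two local servers come out at roughly 1.5x, while eight remote servers across two quad-core machines only reach about 2.1x.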

Internals

There’s a little bit of the internals to Drbman that you really ought to be aware of.

First, Drbman uses net-ssh to connect to each host machine. It also uses net-scp to copy files to the host machines. This means you must be able to directly log into the host machine using ssh from your client machine (currently gateway sessions are not supported). I’ve tested in two scenarios: passwordless certificates and normal password logins.

Next, Drbman will create a directory, ~/.drbman, on each host and leave it. Then each time your client uses the host, it will create a ~/.drbman/{uuid} directory where it will upload your @choices[:dirs] directories. It will also create a controller file in this directory, "#{File.basename(@choices[:run], '.*')}_controller.rb", that is used to daemonize and control your DRb server. This controller takes the normal daemon commands (start, stop, status) as well as command line parameters it should pass to your DRb server (after a double hyphen '--'). The command line for our prime example is:

~/.drbman/{uuid} # ruby foo_controller start -- example.com 1234
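The naming rule and the command layout can be tried out in isolation. The sketch below only builds strings from the rule quoted above; `controller_name` and `controller_command` are hypothetical helpers, and Drbman's own code may well differ.

```ruby
# Controller file name derived from the :run file, per the rule above.
def controller_name(run)
  "#{File.basename(run, '.*')}_controller.rb"
end

# Daemon command (start, stop, status); anything after '--' is meant
# for the DRb server itself.
def controller_command(run, action, *server_args)
  parts = ['ruby', controller_name(run), action]
  parts += ['--', *server_args] unless server_args.empty?
  parts.join(' ')
end

puts controller_name('drb_server/prime_helper.rb')
puts controller_command('drb_server/prime_helper.rb', 'start', 'example.com', 1234)
puts controller_command('drb_server/prime_helper.rb', 'stop')
```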

So the basic setup flow is:

  1. open ssh connection to host machine
  2. create ~/.drbman/{uuid}
  3. scp @choices[:dirs] to ~/.drbman/{uuid}
  4. create the controller in ~/.drbman/{uuid}
  5. daemonize the @choices[:run] file by starting the controller

The cleanup flow is:

  1. stop the DRb server on the host machine using DRbObject.stop_service
  2. stop the daemon by issuing a controller stop command
  3. delete the project’s directory (~/.drbman/{uuid}).

Note, if you set @config[:leave] to true, then the project’s directory will not be deleted. This can be useful for debugging. I’ve also found this little script useful to gracefully stop any running daemons and delete all the project files in ~/.drbman:

#!/usr/bin/env ruby

# place this script in the ~/.drbman directory
# then use it to stop all daemons and delete
# all projects

Dir.glob("**/*_controller.rb").each do |controller|
  `ruby #{controller} stop`
  unless `ruby #{controller} status` =~ /running\s+\[pid\s\d+\]/
    `rm -rf #{File.dirname(controller)}`
  end
end

Have fun,
Roy
