RSpec, Mocha, and Shoulda

The latest release of shoulda-matchers, 1.4.2, now requires Mocha.  I’ve been using the RSpec stubbing library, but adding mocha to the project breaks one little part of RSpec stubbing.  While Mocha uses stubs to stub a method and RSpec uses stub, both use unstub if you need to unstub a single method.  This collision causes problems if you’re using RSpec to create the stub, because the call to unstub uses Mocha which then promptly fails to remove the RSpec stub!

The solution is to use the RSpec alias unstub! wherever you need to remove an RSpec stub – since Mocha doesn’t implement unstub!, you can continue to happily use the RSpec stubbing library and still successfully unstub RSpec stubs.

Make declarative_authorization play nicely with fakefs during testing

It appears that the demand-loading of config/authorization_rules.rb has problems when fakefs is turned on. If the first test in a sequence to trigger declarative_authorization checks has fakefs turned on, none of the rules will get loaded. The odds of this are slim, but if you have rspec doing randomization the odds increase if you are rerunning a single spec file.

The workaround is simple. Just force declarative_authorization to load the rules before any tests run. Just add the following to your spec/spec_helper.rb file:

# This ensures authorization rules are loaded prior to tests that use FakeFS
Authorization::Engine.instance

Monkey patching I18n.t in Rails

During the Ruby Hangout on Nov 7th, I suggested that it would be cool if instead of writing t 'path.to.my.string', one could instead write t.path.to.my.string. Josh Szmajda made the mistake of agreeing, and so I decided this would be a good challenge for me to learn some more Ruby tricks.

First of all, I poked into I18n and made a good discovery – the translate method is not particularly useful without passing parameters. This meant that I could use the case where there’s an empty list of parameters to return a Funky Object(TM) that would do something cool with method_missing. Also, I decided that since :t is simply an alias for :translate, I could choose to write my own t that would defer to translate when passed args and do my own thing when it wasn’t. Basically, I could write something like:

require 'i18n'

module I18n
  class << self
    def t(*args)
      if (args.empty?)
        # Do something cool here!
      else
        translate(*args)
      end
    end
  end
end

I did some quick testing and discovered that did indeed work. Now on to the tricky part. Here’s the first version:

require 'i18n'

module I18n
  class << self
    def t(*args)
      if (args.empty?)
        translator = Object.new
        translator.instance_variable_set(:@path, '')
        def translator.method_missing(method_id, *args)
          raise StandardError, "No args please!" unless args.empty?
          @path = "#{@path}.#{method_id.to_s}"
          answer = I18n.translate(@path)
          answer.respond_to?(:keys) ? self : answer
        end
        translator
      else
        translate(*args)
      end
    end
  end
end

This creates a new Object object, then sets the instance variable @path on it to an empty string, and then sets up method_missing on that object to handle arbitrary methods. It looks them up using I18n.translate in whatever path is currently active. If the returned thing behaves like a Hash, then we presume we’re going to need to do this again, and so we just return self so the next method call can follow along. On the other hand, if we gotten to something that doesn’t look like a Hash, then we just return that and we’re done.

The only problem with this approach is that there are a whole bunch of methods defined in Object that can’t be used as keys. For instance, if you have the key inspect in your locale file, you’re out of luck getting to it with this notation.

So I decided to try to swap in BasicObject for Object. This took me a little longer to figure out since I still needed a way to call instance_variable_set. I finally came up with the following. It may not be the best way to handle this, but it’s what I stumbled upon.

module I18n
  class << self
    def t(*args)
      if (args.empty?)
        translator = BasicObject.new
        def translator.__instance_variable_set(sym, obj)
          ::Object.instance_variable_set(sym, obj)
        end
        translator.__instance_variable_set(:@path, '')
        def translator.method_missing(method_id, *args)
          raise StandardError, "No args please!" unless args.empty?
          @path = "#{@path}.#{method_id.to_s}"
          answer = I18n.translate(@path)
          answer.respond_to?(:keys) ? self : answer
        end
        translator
      else
        translate(*args)
      end
    end
  end
end

I add another singleton that can call ::Object.instance_variable_set for me. I have no idea what this syntax is actually doing – I stumbled upon it by looking at the documentation on BasicObject in the Pickaxe Book. But it works! When I use I18n.t.hash_name in irb I get translation missing: en.hash_name.inspect – note that .inspect on the end that results because it tries to look up the inspect message that irb sends the result!

I’m not at all sure whether this whole exercise was a good idea from a production code standpoint, but it was fun!

Using factory_girl with declarative_authorization

I recently added declarative_authorization to a Rails app that uses FactoryGirl with RSpec. As Mark Needham pointed out around two years ago at http://www.markhneedham.com/blog/2010/09/12/ruby-factorygirl-declarative_authorization-random-thoughts/, this can cause problems because FactoryGirl can’t save records without the appropriate authorization. Mark’s suggestion was to use monkey patching to wrap around the create method, but I had some issues attempting to implement this. I suspect the internals of FactoryGirl have changed since Mark posted.

After extensive poking around and trial and error, I stumbled across http://robots.thoughtbot.com/post/23039827914/get-your-callbacks-on-with-factory-girl-3-3, which has a whole discussion about callback support that was added in FactoryGirl 3.3.0 (which was less than six months old at the time of this posting). It first occurred to me to use the before(:create) and after(:create) to set Authorization.current_user, but after I read to the end of the post I realized that to_create was exactly what I needed.

All I had to do was to create the file spec/factories.rb and added the following code to it:

require 'declarative_authorization/maintenance'
include Authorization::TestHelper

FactoryGirl.define do
  to_create do |instance|
    without_access_control { instance.save! }
  end
end

Voila! No monkey patching, and I only have to create one file and all my calls to FactoryGirl.create start working again.

Some Explorations in Functional Programming using Ruby

I’m taking a course in Ruby on Rails, and one of the suggested projects is to write a function that will return the class hierarchy for an object.  This is a very simple problem, but it got me to thinking.  The obvious solution is to use a loop.  But Ruby has a whole set of collection-based idioms using block syntax that seem to inherit from Smalltalk.  I wondered if I might be able to put those to use.

So I started thinking about this whole class of problems.  I wanted a way to express an enumerator where f(n) depends solely on the values of a fixed number of recent values.  For instance, in the case of returning the class hierarchy for an object, f(n) depends solely on f(n-1), whereas in the case of generating the Fibonacci sequence, f(n) depends upon f(n-1) and f(n-2).  One approach to these sort of problems is to use recursion, although in the case of the Fibonacci sequence, recursion can result in performance issues that can be resolved through the use of Memoization.  I wanted to encode the recursive relationship in a non-recursive solution.

So I came up with:

def enumerator_from_generator_simple(*history, &block)
	Enumerator.new do |yielder|
		history.each { |x| yielder.yield x }
		loop do
			history = history[1, history.length-1] + [block.call(*history)]
			yielder.yield history[-1]
		end
	end
end

This takes two things – a list of “seed” values for the history and a block. The block will be passed the history and is expected to generate the next value in the sequence. The number of history values to pass to the block is determined by the number of seed values passed initially. With this method in hand, we can then create an enumerator for the Fibonacci sequence and then extract the first 10 values (be very careful not to try to extract all the values – this is the enumerator for an infinite list):

foo = enumerator_from_generator_simple(1, 1) do
	|*history|
	history[0]+history[1]
end

p foo.first(10)

We can also then write the class hierarchy enumerator:

def get_superclasses(object)
	superclass_enum = enumerator_from_generator_simple(object.class) do |old_class|
		new_class = old_class.superclass
		raise StopIteration if new_class == nil
		new_class
	end
	superclass_enum.to_a
end

p get_superclasses(5)

Note than in this case, I defined the method by having it create a new enumerator using the class of the passed object as the history. I use raise StopIteration to halt the enumeration once we reach a class whose superclass is nil. Voila! No looping, just an expression of the recursive relationship.

My attention then turned to two different areas. The first was adding some performance improvements to the code. These may be a little excessive, but it struck me that the code listed above is a bit inefficient when dealing with some of the most common cases, such as a history of one element. So I wrote the following, which has hand-tweaked code for the most common cases (no elements, one element, and two elements):

def enumerator_from_generator(*history, &block)
	if history.length == 0 then
		Enumerator.new do |yielder|
			loop do
				yielder.yield block.call
			end
		end

	elsif history.length == 1 then
		history = history[0]
		Enumerator.new do |yielder|
			yielder.yield history
			loop do
				history = block.call(history)
				yielder.yield history
			end
		end

	elsif history.length == 2 then
		history_0 = history[0]
		history_1 = history[1]
		Enumerator.new do |yielder|
			yielder.yield history_0
			yielder.yield history_1
			loop do
				history_0, history_1 = history_1, block.call(history_0, history_1)
				yielder.yield history_1
			end
		end

	else
		Enumerator.new do |yielder|
			history.each { |x| yielder.yield x }
			loop do
				history = history[1, history.length-1] + [block.call(*history)]
				yielder.yield history[-1]
			end
		end

	end
end

With that in hand, I turned to some simple testing. As part of testing, I realized that the Fibonacci sequence is actually a special case of a set of Fibonacci sequences. We’ll call the “traditional” Fibonacci sequence Fib2, where Fib2(n) = Fib2(n-1) + Fib2(n-2). The next one in the set of sequences would be Fib3 = Fib3(n-1) + Fib3(n-2) + Fib3(n-3). And so forth and so on. If one uses history.inject(:+) in place of the explicit addition, it becomes simple to express a whole range of these. Finally, I start generating the enumerator using brace block syntax so I can cram everything on one line.

p ( enumerator_from_generator(1, 1) { |*history| history.inject(:+) } ).first(20)
p ( enumerator_from_generator(1, 1, 1) { |*history| history.inject(:+) } ).first(20)
p ( enumerator_from_generator(1, 1, 1, 1) { |*history| history.inject(:+) } ).first(20)
p ( enumerator_from_generator(1, 1, 1, 1, 1) { |*history| history.inject(:+) } ).first(20)

I just increase the number of elements passed initially and, voila, Fibx!

Then I got to thinking that all of this addition was a little excessive. In the case of two or three elements, it’s not bad. But what if I was computing Fib1000? The vast majority of the addition in each step was done in the previous step. I realized that Fibx(n) = 2 * Fibx(n-1) – Fibx(n-1-x). That is to say, each time the very oldest history element drops off and we add a new Fibx(n-1). But since Fibx(n-1) is equal to the whole sum without the changes, we can just double it and subtract the oldest element. Of course, then we need to keep track of x+1 elements in our history and we need to seed the list with an initial sum, but the following enumerates Fib5 just as well as the last example above:

p ( enumerator_from_generator(1, 1, 1, 1, 1, 5) { |*history| history[-1] * 2 - history[0] } ).first(20)

My final realization was that Fibx(n)/Fibx(n-1) has an asymptotic limit – specifically, if the limit as n goes to infinity of Fibx(n)/Fibx(n-1) is r, then r = 2 – 1 / r^x, which can be recast as r^(x+1) – 2 r^x + 1 = 0. For instance, in the case of Fib2, that is r^3 – 2 r^2 + 1 = 0. There are functions for computing cubic roots, but I’m lazy, so I’ll just take advantage of enumerator_from_generator to write a numeric solver!

p ( enumerator_from_generator(2.0) { |history| 2.0 - 1.0 / (history**2) } ).first(20)

Hey, that converges pretty quickly on the Golden Mean! Which means we’re getting the right answer, since it is well documented that the ratio of consecutive values in the Fibonacci sequence converges on the Golden Mean.

But what if I want to enumerate reasonable converged values for Fibx?

def fibx_ratio(x, iter)
	( ( enumerator_from_generator(2.0) { |history| 2.0 - 1.0 / (history**x) } ).first(iter) )[-1]
end

1.upto(10) { |x| p fibx_ratio(x, 100) }

And, as we expect, the ratio for Fibx approaches 2 as x approaches infinity. Which is to say that while Fib2 approximates n^1.618, and Fib3 approximates n^1.839, by the time we get to Fib9 and Fib10, there isn’t much difference between the two functions.

And that, I think, is probably the most I’m going to wring out of this specific suggested project!

RESTful Design and Linguistics

Since emerging from 16 years in a cave, I’ve discovered this design philosophy called REST.  It seems to be popular, although I’m also discovering that there’s a lot of contention around it, and a lot of the descriptions of it seem to miss some of the key points.

I’ve found the following posts to be most helpful in trying to understand the philosophy:

Reading all this stuff got me thinking about REST and how to adapt the OO model that many developers carry around in their heads (nouns and verbs).  The CRUD model maps pretty well onto REST philosophy (although the mapping has some ambiguity in it depending upon the implementation details), but there are a lot of other verbs out there that we need to handle.  The quintessential verb is balance_transfer in a banking application.  The breakthrough came when I realized that verbs can be recast as creating a record of the verb, and the record of the verb is itself a noun.  Thus, instead of executing the balance_transfer action, instead one should create a balance_transfer record.  Creating that record may have side effects (such as updating the balances on the affected accounts).  That record may not even be persisted in the application (although I’d hope that it is in the case of a banking application – audit trails are important).  But from an API standpoint, one is creating a noun, and that is easy to handle in REST.  One way of thinking of this is that in a database with an audit trail that is 100% complete, the current data in the database is simply a persistent cache of the result of committing everything in the audit trail!  As a result, one doesn’t need CRUD at all – one only needs CR on the audit trail!  Every change to the database is simply the result of creating the appropriate record in the audit trail.

All of this collided with a dream I had several months ago.  In the dream I asked a friend what “uninstallable” meant – even in my dream, I’d realized that the word “uninstallable” is ambiguous. It can either mean “unable to be installed” or “able to be uninstalled”!  All excited, I ran to my computer and discovered that someone had written a  paper on the subject - Hierarchical Morphological Structure and Ambiguity (Carl Vikner).  Not only does the paper discuss the ambiguity, but it also brings up a very interesting point. Certain verbs are reversible. For instance, one can unlock a door. One cannot, however, uneat an apple (this is a state change verb, but one cannot undo the state change – the entropy increase is too large) or unyawn (this is a non-state change verb).

So we can divide verbs into three categories:

  • No state change (yawn, wag, etc.)
  • Reversible state change (install, tie, lock, etc.)
  • Irreversible state change (eat, burn, etc.)

In most databases, viewing a record falls into the “no state change” category, but in some databases it falls into one of the latter two categories.  For instance, in a medical records database that complies with HIPAA, viewing a record falls into the “irreversible state change” category because it creates a an audit entry to enable investigations of unauthorized access.  In a mail reader application, viewing (or reading) an e-mail is a reversible state change – it marks the e-mail as read, but there is generally a “Mark as Unread” action.  Note that reversing the action of reading the e-mail has to be referred to by a circumlocution since “unread” is not a valid verb in English!

Which brings up a question.  If viewing has side-effects, should one use GET as the verb?  GET is supposed to be nullipotent, but if viewing creates an audit trail (which is non-idempotent) or marks the record as read (which is idempotent, but not nullipotent), then is GET the appropriate HTTP method?  After some serious thought on this, I’ve concluded the answer is no.  Imagine a browser that does aggressive pre-caching of linked pages.  It can do that because GET is nullipotent.  If, however, viewing records is non-nullipotent, then you want to ensure the browser does not do pre-caching of linked pages.  As a result, all of the links for viewing records should be transitioned to using POST.  This, unfortunately, makes development a lot uglier, but that’s the price one has to pay for ensuring adherence to the browser model.

The next question that occurred to me is how to handle reversible state changes.  For instance, I’m curious how banking systems handle reversing a deposit.  There are two ways of handling this – one is to delete the original deposit record (there might be an audit trail somewhere, but in the transaction log for your account the original deposit simply wouldn’t show).  The other would be to issue a subsequent “undo deposit” transaction.  The latter has the advantage of preserving the original transaction entries as they posted.  The former has the advantage of some degree of simplicity.  However, in both cases there are potential issues that need to be considered.  Imagine I have a balance of $1000 from Jan 1st through Mar 1st.  On Mar 1st, an invalid deposit of $999,000 gets made into my bank account.  On Apr 1st, I finally notice that my account balance seems a tad high and notify my credit union.  They promptly reverse the transaction using either model.  However, I still end up with a higher balance than I would have otherwise!  Why?  Because on Mar 31st, the bank computed interest.  In this hypothetical case, let’s assume I get 1% per quarter (wouldn’t that be nice).  As a result, I get $3,340 in interest on my average balance of $334,000 for the quarter.  If the invalid deposit had never been made, I would have received only $10 in interest.  I assume banking applications have ways of handling this – either all balance-dependent transactions following an invalidated transaction are modified, or if an adjustment transaction is inserted on Apr 1st, then adjustment transactions for all of the balance-dependent transactions are inserted as well.  Issues can be observed in the reverse case as well – imagine invalid withdrawals being made that result in negative-balance fees being applied to the account.  Reversing those invalid withdrawals also needs to clean up the negative-balance fees.

Hello world from Toby Ovod-Everett

I’m a software developer located in Anchorage, Alaska.  I have a strong interest in database front-end development.  While I find web front-ends to have some serious limitations from a UI perspective, I’ve been convinced that the dramatic simplification of deployment and updating outweigh those disadvantages.

Historically, I’ve worked for the last 16 years doing automated workstation management, including building my own framework for deploying software under SMS, developing my own tool sets for package and install/update routine validation, etc.  That was a 16 year detour from my primary interest, though, and I’m excited to be back on track.

Along the way I developed an OO framework for developing ASP web frontends for SQL Server databases using Perl.  The logic tier and most of the presentation tier were encapsulated in an object layer that implemented the database access, business logic, and a majority of the HTML generation and ASP interaction.  That was skinned by very thin ASP pages.  In order to facilitate rapid development and maintenance, a custom data dictionary was developed in a separate SQL Server database.  Perl scripts accessed the data dictionary to do code generation for the 400KB of T-SQL code that defined the database structure, as well as for the Win32::ASP::DB object layer and the ASP pages.  Almost all of the database-specific code was stored in the data dictionary and was injected during the code generation.  In addition, the code generators supported a variety of callbacks stored in the data dictionary that manipulated and extended the structure early in the code generation process, thus permitting repetitive entries to be automatically generated.

As part of that, Ned Konz and I co-developed Class::Prototyped for Perl.  Class::Prototyped is a module that supports fast prototype-based OO programming in Perl.  The semantics came straight out of the Self language, but we stuck to Perlish syntax as much as possible.  It has support for OO reflection and meta-programming, both on objects built using Class::Prototyped as well as arbitrary existing Perl classes.

My long-term interests include:

  • Database design
  • Front-end design
  • Meta-programming
  • DRY

 

However, I’ve come to the conclusion that I need to stop developing custom frameworks from scratch.  So I’ve thrown in my hat with the Ruby on Rails community, and if I find itches in RoR that need scratching, I’ll start writing and coding.