Log in

No account? Create an account
entries friends calendar profile Elf Sternberg's Pendorwright Projects Previous Previous Next Next
When unit tests don't catch what's really important - Elf M. Sternberg
When unit tests don't catch what's really important

The following code is for a function called metacadr, which takes an argument in the form of “c[ad]*r” and returns a function that will traverse a linked list, for which the payload of some nodes may themselves be linked lists, creating a linked tree, and return an arbitrary value from an arbitrary node.  This code is in coffeescript, but if you read javascript, it’s pretty readable. The function car means “Give me the value of this node” and cdr means “Give me the handle to this node’s next node.”

metacadr = (m) ->
  seq = m.match(/c([ad]+)r/)[1].split('')
  return (l) ->
    inner = (l, s) ->
      return nil if nilp l
      return l if s.length == 0
      inner ((if s.pop() == 'a' then car else cdr)(l)), s
    inner(l, seq)

I wrote several unit tests with empty lists, simple lists, and tree lists, and all of them reported this function was fine. So when I began to use it and the library it supports in production, it immediately created wild errors deep inside other pieces of code. Can you tell what’s wrong with it?

The unit tests only ran metacadr() once per try. It was only trying to assert that the function it created would process its sequence of arguments correctly. What I didn’t test was this: did the function returned behave correctly when used multiple times?

You see that .pop() in there? It’s a disaster waiting to happen. It’s not a traversal, it’s a mutation. Worse, it’s a casual mutation in a language that always (always!) uses pass-by-reference. When I call inner(l, seq), I kinda expected seq to be copied, but it wasn’t; only the reference was copied. I expected I was calling .pop() on something safe and reusable, but it had a handle on my original sequence, which it destroyed. The function returned by metacadr only worked once.

Lessons learned for Javascript:

  • Casual mutation and call-by-reference are deadly together

  • Using .pop(), .push(), .shift(), .unshift() is almost always the wrong thing to do

  • If your function returns a function, you must test the returned function multiple times to make sure its behavior isn’t self-destructive

In case you’re wondering, the function now looks like this:

metacadr = (m) ->
  ops = {'a': car, 'd': cdr}
  seq = vectorToList (m.match(/c([ad]+)r/)[1].split(''))
  (l) ->
    inner = (l, s) ->
      return l if (nilp l) or (nilp s)
      inner ops[(car s)](l), (cdr s)
    inner l, seq

Improvements: I’ve replaced the ‘if’ expression with a lookup table. This removes a problematic codepath and encourages more defensive programming. By using a linked list I’ve correctly created a traversal sequence that is non-destructive.

I could have gone with an iterative solution, keeping an index, and all the usual Javascriptiness, but for me, this is the elegant (and hence correct) solution.

Leave a comment