Structure Awareness

As a denizen of Twitter and various Slack channels, I find myself in conversations about refactoring and TDD in which people will discuss the struggles they have in this technical realm.

One of the frequent concerns is that “unit tests[1] prevent refactoring.”

This is upsetting since the entire point of TDD is to support refactoring.

Why do people find their tests an obstacle to refactoring?

After many conversations, I find that there is a common reason that many people experience this pain, and that is that their tests (and sometimes production functions) are too structure-aware.

A long, long time ago, well before TDD and design patterns became mainstream (possibly before many modern developers were born), a paper came out recommending a simple set of rules for keeping your code from being too dependent on the object structure of your application.

This paper made a big splash at the time, and most of us who were programming back then learned, internalized, and semi-forgot the Law of Demeter.

I think that we forget to share and discuss it among ourselves, but most who were around at the time apply it in normal practice. We might call out the train wreck pattern, and may eventually get around to crediting the Law of Demeter for our thinking, but it’s one of those “nearly-forgotten classic” types of ideas.

Far from being quaint and old-fashioned, the Law of Demeter is evergreen and perhaps one of the most important missing ingredients in modern TDD.

Newer TDDers are at a distinct disadvantage if they don’t know the value of structure-shy programming.

Structure-Aware Failure Mode

The code smell related to violating this principle is called “train wreck”:

class A:
   def deliver_message(self, account, emailer, title, text):
      destination = account.holder.primary_email.address    # here
      sender = emailer.server.ip_address                    # here too
      message = emailer.formatter(destination, title, text)
      emailer.server.post_message(sender, message)          # another one

You can see that the above snippet is reaching through the account rather deeply to access a primary email address.

It also is reaching pretty deeply into the emailer object.

[Image: a naughty child reaching inside a vending machine. Caption: Reaching inside can get you in trouble.]

This is a structure that is made easier by auto-completion in our smart code editors, but is it a good structure?

It is easy to imagine tests that not only reach through objects to deeper functions and attributes, but also reach rather deeply to form assertions or insert mocks.

If dozens of tests and functions in the system typically reached through the account and emailer in similar ways, how many refactoring changes could cause those functions or tests to be broken and invalid?

It is easy to imagine these tests and methods resisting efforts to refactor. These tests, rather than supporting refactoring as we intend, are reasons to leave the code exactly as it is now; they are reasons to stay wrong.
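To make that concrete, here is a minimal sketch of such a structure-aware test, written against the train-wreck version of deliver_message (shown here as a plain function for brevity). The test name, the sample values, and the choice of unittest.mock are assumptions made for illustration; they are not taken from any real codebase.

```python
from unittest.mock import MagicMock


def deliver_message(account, emailer, title, text):
    # Train-wreck version: reaches through both collaborators.
    destination = account.holder.primary_email.address
    sender = emailer.server.ip_address
    message = emailer.formatter(destination, title, text)
    emailer.server.post_message(sender, message)


def test_deliver_message_structure_aware():
    # The test has to rebuild the same object graph the code walks.
    account = MagicMock()
    account.holder.primary_email.address = "pat@example.com"
    emailer = MagicMock()
    emailer.server.ip_address = "10.0.0.1"
    emailer.formatter.return_value = "formatted message"

    deliver_message(account, emailer, "Hi", "Hello there")

    # The assertions also reach two levels deep into the emailer.
    emailer.formatter.assert_called_once_with(
        "pat@example.com", "Hi", "Hello there")
    emailer.server.post_message.assert_called_once_with(
        "10.0.0.1", "formatted message")


test_deliver_message_structure_aware()
```

Move the address onto a different object, or rename server, and this test breaks even though the message might still be delivered correctly. The test knows as much about the object graph as the production code does.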

We don’t want to stay wrong, so what we are looking for is some technique that allows us to change the current structure easily while refactoring.

We need some way to keep the code structure shy.

The Law of Demeter

A lot of us trace our understanding of structure-shy code to the Pragmatic Programmers, Andy Hunt and Dave Thomas, via their article “Keep it DRY, Shy, and Tell the Other Guy.”

Others connected first with The Law Of Demeter from OOPSLA ’88, written by Lieberherr, Holland, and Riel. I’m in this latter camp and happily discovered Hunt & Thomas later.

The Law of Demeter is not a natural law, but rather a concept that one can choose to follow. We find there is some value in treating the Law of Demeter rigorously, as though it were legally required (though situational judgment always applies).

The core principle is that any unit of code should have limited knowledge of the structure of any other units. This applies to modules, namespaces, libraries, and objects. The PragProgs suggested that this is the very soul of object-oriented programming.

Let’s take the case of methods (member functions) of objects. A method is only allowed to access methods and attributes of:

  • its class
  • its parameters
  • its local variables (created in the method)
  • globals accessible in the scope of its class[2]

This has been simplified to a more concise motto: Only Use One Dot.[3]
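As a rough illustration of those four categories, consider the sketch below. The Car, Engine, and Key classes and the DEFAULT_GREETING global are hypothetical names invented for this example; the comments mark which rule permits each access.

```python
DEFAULT_GREETING = "Hello"  # hypothetical module-level global


class Engine:
    def __init__(self):
        self.running = False

    def start(self):
        self.running = True


class Key:
    def __init__(self, code):
        self.code = code

    def matches(self, code):
        return self.code == code


class Car:
    def __init__(self, lock_code):
        self.lock_code = lock_code
        self.engine = Engine()

    def start_with(self, key):
        # Allowed: a parameter (key) and an attribute of its own class.
        if not key.matches(self.lock_code):
            return "refused"
        # Allowed: self.engine is a direct member of this class.
        self.engine.start()
        # Allowed: a local variable, built from a global in scope.
        parts = [DEFAULT_GREETING, "driver"]
        return " ".join(parts)
```

Nothing in start_with asks the key or the engine for one of their parts and then talks to that part. A line like key.owner.address.city would be the violation.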

There is an amusing metaphor for violating the Law of Demeter at Erik Dietrich’s blog, and perhaps you will find it enlightening.

How does this play out?

We must not reach through the objects to get their subobjects, so we have to ask the object that is within our reach (see above rules) to reach deeper for us.

This leaves us with pass-through methods.

   def deliver_message(self, account, emailer, title, text):
      destination = account.primary_email_address()
      message = emailer.post(destination, title, text)

Earlier the email address was acquired by a train wreck which is clearly in violation of the Law of Demeter.

destination = account.holder.primary_email.address

To eliminate the train wreck, we convert the code to ask the Account for its primary email address. In doing that we renamed a variable or two, but it should map pretty well.

    class Account:
       def primary_email_address(self):
          return self.holder.primary_email_address() 

    class Holder:
       def primary_email_address(self):
          return self.contact.get_primary_email()

    class Contact:
       def get_primary_email(self):
          return self.email_addresses[0]

This series of pass-through functions may seem odd, but it does leave decision-making at the lowest (innermost) layer. The contact knows that the primary email is the zeroth email in the list, and nobody else needs to know it.

Ensuring a decision exists in exactly one place in the codebase is known by other names including Data Hiding, Single Point Of Truth (SPOT), Don’t Repeat Yourself (DRY), and Once And Only Once (OAOO).

It’s funny that the idea of avoiding duplication has so many names.

I suspect it is because this idea was independently discovered and promoted by different people in different places.

Now the Account, Holder, and classes using Account are unaware of the deeper structure of the application beyond the limits that the Law of Demeter allows, so that accounts and holders may be further split or consolidated without their tests or callers being disrupted.
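Here is a small runnable sketch of that claim. The constructors, the RefactoredHolder class, the destination_for helper, and the sample addresses are all assumptions added for illustration. The point is only that a caller written against Account keeps working when the structure behind it is consolidated.

```python
class Contact:
    def __init__(self, email_addresses):
        self.email_addresses = email_addresses

    def get_primary_email(self):
        # Only Contact knows the primary address is stored first.
        return self.email_addresses[0]


class Holder:
    def __init__(self, contact):
        self.contact = contact

    def primary_email_address(self):
        return self.contact.get_primary_email()


class RefactoredHolder:
    # Hypothetical restructuring: Contact is folded into the holder
    # and the primary address is stored explicitly.
    def __init__(self, primary_email, other_emails=()):
        self._primary_email = primary_email
        self._other_emails = list(other_emails)

    def primary_email_address(self):
        return self._primary_email


class Account:
    def __init__(self, holder):
        self.holder = holder

    def primary_email_address(self):
        return self.holder.primary_email_address()


def destination_for(account):
    # A caller: it only ever talks to the Account.
    return account.primary_email_address()


before = Account(Holder(Contact(["pat@example.com", "pat@work.example"])))
after = Account(RefactoredHolder("pat@example.com", ["pat@work.example"]))
assert destination_for(before) == destination_for(after)
```

Neither Account nor destination_for changed when Contact disappeared, and a test written against Account would not have changed either.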

It simply doesn’t matter to our methods what happens behind the scenes of any of their direct collaborators, so long as the right things are still being done.

Weird Classes

While this solves a lot of problems with refactoring, we can immediately raise the concern that all of these pass-through methods might make a lot of really weird-looking classes.

What happens to the “single responsibility principle” when classes gather a bunch of pass-through methods like this?

Here is an account with methods that seem more appropriate to the account-holder and email system! Doesn’t that corrupt the purity of the public interface of the account?

Well, sure.

Take a moment to bask in your awareness of code craft. You have definitely spotted a vector by which a class may have its primary purpose subverted or obscured! Well done!

Deep breaths and pats on the back are appropriate here, and possibly high-fives.

Now take a moment to notice that the classes that use the Account were already accessing the email addresses through the account: the Account was accidentally responsible to its clients for providing the whole structure that ultimately delivers the email address!

This is a startling revelation: a sudden appearance of Hyrum’s Law.

Because exposing the application’s inner structure made it possible to depend on the inner structure of Account/Holder/etc, the tests and code of the system became dependent on that structure.

The dependency was implicit rather than explicit. You didn’t see it in the classes’ source files.

You were overlooking all the implicit and duplicated dependencies in your app right up until the moment when you could no longer refactor your code safely.

But now we can change that. By refactoring to the Law of Demeter, we make the implicit explicit and make structural changes possible. All of the actual uses of the class become visible.

Do we create new wrapper or controller objects? Do we shuffle our object design? Do we look at new interfaces?

We can introduce those now. Our options are wide open now that refactoring is back in play.

Guess what else? Now that we’re not structure-aware it is easier to create and use test doubles.
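As a sketch of what that can look like, the test below exercises the structure-shy deliver_message with two hand-rolled doubles. FakeAccount, RecordingEmailer, and the sample values are hypothetical names chosen for this example, and deliver_message is written as a plain function to keep the sketch short.

```python
class FakeAccount:
    # Test double: answers the one question deliver_message asks.
    def __init__(self, address):
        self._address = address

    def primary_email_address(self):
        return self._address


class RecordingEmailer:
    # Test double: records what it was asked to post.
    def __init__(self):
        self.posted = []

    def post(self, destination, title, text):
        self.posted.append((destination, title, text))


def deliver_message(account, emailer, title, text):
    destination = account.primary_email_address()
    emailer.post(destination, title, text)


def test_deliver_message_posts_to_primary_address():
    emailer = RecordingEmailer()
    deliver_message(FakeAccount("pat@example.com"), emailer,
                    "Hi", "Hello there")
    assert emailer.posted == [("pat@example.com", "Hi", "Hello there")]


test_deliver_message_posts_to_primary_address()
```

Each double needs exactly one method because the code under test talks to each collaborator through exactly one method. There are no nested holders, servers, or formatters to fake.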

Notes

  1. We don’t say “unit tests” anymore. We refer to TDD-appropriate tests as microtests since they are very specific, tiny, and quick.

  2. We may recognize ‘globals’ as a code smell, but we also may find that it’s a hard smell to circumvent sometimes and might be present (if temporarily) in already-existing code. That’s not the subject of this post, though, so we will leave it as-is. 

  3. In Python, you refer to self explicitly, so the “only use one dot” rule has to be modified a bit. A line such as return self.holder.primary_email_address() is legit because self.holder is a member variable of the class Account.