The Swap Statement Refactoring

Posted July 7, 2016 by Bill Wake

When we wish statements were in a different order, the Swap Statement refactoring sometimes lets us swap them.

For example:

        k = 0                                a = sin(x) * cos(x)
        a = sin(x) * cos(x)         ⇔        b = tan(x) + x * x
        b = tan(x) + x * x                   k = 0
        while (k + 3 > 0) ...                while (k + 3 > 0) ...

Why?

Readability: If we can bring related statements closer together, it's easier to understand the flow of information.

In the example above, the transformed code makes it clearer that the loop will be executed at least once, because k = 0 now sits directly above the test k + 3 > 0. (Moving an initializer closer to its use is a very common case.)

Duplication: If we can rearrange code so that subtle duplication becomes more obvious, then the duplication is easier to remove.

Imagine that the example above had another loop further down, similar to the one involving k. If the two loops are structurally identical (initialize then loop), it's easier to extract a common method.
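
As a rough Python sketch of that idea: the second counter m, the loop bodies, the total variable, and the helper name count_steps are all invented for illustration, since the original example leaves the loop body elided. The first function has both initializers far from their loops, the second shows the code after a few legal swaps, and the third shows the extraction that the swaps made obvious.

```python
def before(x):
    """Original order: both initializers sit far from the loops that use them."""
    k = 0
    m = 0
    a = x * 2              # unrelated work, standing in for the sin/cos lines
    total = 0
    while k + 3 > 0:
        k -= 1
        total += 1
    while m + 5 > 0:
        m -= 1
        total += 1
    return a, total


def after_swaps(x):
    """Same statements, reordered so each initializer is directly above its loop."""
    a = x * 2
    total = 0
    k = 0
    while k + 3 > 0:
        k -= 1
        total += 1
    m = 0
    while m + 5 > 0:
        m -= 1
        total += 1
    return a, total


def count_steps(offset):
    """Hypothetical extracted helper: initialize a counter, then loop."""
    counter = 0
    steps = 0
    while counter + offset > 0:
        counter -= 1
        steps += 1
    return steps


def after_extract(x):
    """The two 'initialize then loop' units now share one helper."""
    a = x * 2
    total = count_steps(3) + count_steps(5)
    return a, total
```

Each swap here moves a statement past others that neither read nor write the same variables, so all three functions return the same result for the same input.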

How?

This refactoring is only legal sometimes. It works like this:

                       ?
                A ; B  ⇔  B ; A

(where ⇔ with a ? above it indicates the transformation can go forward or backward, when it's legal.)

Statements A and B can be simple or compound statements.

I don't know of a tool that implements this refactoring directly. Most editors can do this move with cut and paste (at some risk). Some IDEs have a command to move selected lines up or down; in both JetBrains IDEs and Eclipse it's some variation of Command-Ctrl-Alt-and/or-Shift-Arrow.

When Is This Refactoring Safe?

 
  1. When the statements being swapped have nothing to do with each other, it's generally safe to swap them:

        a = 0                   b = 2
        b = 2          ⇔        a = 0
    
  2. Values or objects changed by the code in the first statement are not allowed to affect the values or objects read by the second statement:

    	a = 7			b = a + 1      // UNSAFE
    	b = a + 1		a = 7
    
  3. Values or objects changed by the code in the second statement are not allowed to affect the values or objects read by the first statement:

    	a = x			x = 3         // UNSAFE
    	x = 3			a = x
    

    Don't forget to consider objects that are touched indirectly:

    	collection.add("foo")			size = collection.size()    // UNSAFE
    	size = collection.size()		collection.add("foo")
    

    (In this example, collection itself is unchanged, but its contents are affected.)

  4. If exceptions can be thrown, the swap is safe only if we don't care whether an exception is thrown earlier or later because of the swap. For instance, a statement moved below one that throws will no longer run.

  5. If concurrency is involved, side effects are less obvious and the order of access can be critical, so many swaps are not safe:

    Thread 1                          Thread 2
    a.lock()                          a.lock()
    b.lock()                          b.lock()
      critical section for T1           critical section for T2
    b.unlock()                        b.unlock()
    a.unlock()                        a.unlock()

    Thread 1    // UNSAFE             Thread 2    // UNSAFE
    a.lock()                          b.lock()
    b.lock()                          a.lock()
      critical section for T1           critical section for T2
    b.unlock()                        b.unlock()
    a.unlock()                        a.unlock()

    This example demonstrates a classic deadlock. In the upper block of code, both threads acquire the locks in the order "a then b", so deadlock can't occur. In the (non-equivalent) code on the bottom, we can have a situation where Thread 1 acquires lock a, then Thread 2 acquires lock b. When each thread attempts to acquire its second lock, it ends up blocked: deadlock.
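
Rules 1 to 3 above amount to comparing what each statement reads with what the other writes. Here is a minimal Python sketch of that comparison. It assumes someone has already worked out the read and write sets for each statement (the hard part in practice); the function name can_swap, the (reads, writes) pair representation, and the label "collection.contents" are made up for illustration. The third check, for two statements writing the same thing, is not one of the numbered rules but follows the same reasoning (a = 1 ; a = 2 is not the same as a = 2 ; a = 1).

```python
def can_swap(first, second):
    """Return True when two statements pass the read/write independence rules.

    Each statement is a hypothetical (reads, writes) pair of sets naming the
    values or objects it touches, including ones it touches indirectly.
    """
    first_reads, first_writes = first
    second_reads, second_writes = second
    return (
        first_writes.isdisjoint(second_reads)        # rule 2
        and second_writes.isdisjoint(first_reads)    # rule 3
        and first_writes.isdisjoint(second_writes)   # added: both write the same thing
    )


# a = 0 ; b = 2
UNRELATED = ((set(), {"a"}), (set(), {"b"}))

# a = 7 ; b = a + 1
WRITE_THEN_READ = ((set(), {"a"}), ({"a"}, {"b"}))

# a = x ; x = 3
READ_THEN_WRITE = (({"x"}, {"a"}), (set(), {"x"}))

# collection.add("foo") ; size = collection.size()
# The variable `collection` is only read, but its contents are written.
INDIRECT = (
    ({"collection"}, {"collection.contents"}),
    ({"collection", "collection.contents"}, {"size"}),
)

print(can_swap(*UNRELATED))        # True
print(can_swap(*WRITE_THEN_READ))  # False
print(can_swap(*READ_THEN_WRITE))  # False
print(can_swap(*INDIRECT))         # False
```

This check says nothing about exceptions or concurrency (rules 4 and 5), so a True result is necessary but not sufficient for a safe swap.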

An Escape Clause

Sometimes, we may have external information that tells us that a swap is ok even though a common object is affected.

In this example, we know from the semantics of sets that the order of adding elements doesn't matter:

        set.add(a)              set.add(b)
        set.add(b)     ⇔        set.add(a)

(With no code accessing the set between the adds, the order is irrelevant.)

We may even know from the semantics of our application that the order doesn't matter:

        list.add(a)    ?        list.add(b)
        list.add(b)    ⇔        list.add(a)

In general, this would not be allowed, since the list (a, b) is different from (b, a). However, if we have privileged information telling us that our application is treating the list like a set, never depending on ordering, then the move could be safe in that case.
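
The difference between the two cases is easy to check in Python. In this small sketch the element values are placeholders, and Python's list uses append where the pseudocode above says add. It builds the set and the list in both orders and compares the results, then compares the lists again while ignoring order, which is what an application that "treats the list like a set" is effectively doing.

```python
from collections import Counter

a, b = "apple", "banana"   # placeholder elements

# Sets: adding in either order gives equal sets.
set_ab = set()
set_ab.add(a)
set_ab.add(b)

set_ba = set()
set_ba.add(b)
set_ba.add(a)

# Lists: the two orders give different lists.
list_ab = []
list_ab.append(a)
list_ab.append(b)

list_ba = []
list_ba.append(b)
list_ba.append(a)

print(set_ab == set_ba)                      # True
print(list_ab == list_ba)                    # False
print(Counter(list_ab) == Counter(list_ba))  # True: same elements, order ignored
```

The last comparison only justifies the swap if every reader of the list really does ignore order; the language can't confirm that for us.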

These arguments go beyond the usual refactoring comfort zone, in which safety is defined in terms of the programming language's syntax and semantics, so be extra careful.

Summary

When it's safe to swap the order of two statements, doing so can improve the readability of code and can let us reduce duplication.

It's only safe when the statements change and use independent values and objects, provided exceptions and concurrency are not an issue.
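
The exception caveat is the easiest one to miss, because the two statements can touch entirely different data. A small Python sketch (the function names, the log list, and the "started" message are invented for illustration): the two statements below share no values, yet swapping them changes what is left behind when the second one throws.

```python
def original(log, items):
    log.append("started")   # A: changes log only
    first = items[0]        # B: reads items only; raises IndexError if empty
    return first


def swapped(log, items):
    first = items[0]        # B now runs first
    log.append("started")   # A is skipped when B raises
    return first


def run(fn, items):
    """Call fn with a fresh log and return the log, swallowing IndexError."""
    log = []
    try:
        fn(log, items)
    except IndexError:
        pass
    return log


print(run(original, [42]), run(swapped, [42]))  # ['started'] ['started']
print(run(original, []), run(swapped, []))      # ['started'] []
```

Whether that difference matters depends on whether anything looks at the log after the failure, which is exactly the "do we care" question.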