This chapter wrinting good and clean functions - We start off with an example for the FITNess codebase and goes from a very hard to read function

public static String testableHtml(
	PageData pageData,
	boolean includeSuiteSetup
   ) throws Exception {
	 WikiPage wikiPage = pageData.getWikiPage();
	 StringBuffer buffer = new StringBuffer();
	 if (pageData.hasAttribute("Test")) {
		if (includeSuiteSetup) {
			WikiPage suiteSetup =
			PageCrawlerImpl.getInheritedPage(
			        SuiteResponder.SUITE_SETUP_NAME, wikiPage
			);
			if (suiteSetup != null) {
				 WikiPagePath pagePath =
				 suiteSetup.getPageCrawler().getFullPath(suiteSetup);
				 String pagePathName = PathParser.render(pagePath);
				 buffer.append("!include -setup .").append(pagePathName)
					 .append("\n");
			}
		}
		WikiPage setup =
			PageCrawlerImpl.getInheritedPage("SetUp", wikiPage);
		if (setup != null) {
			WikiPagePath setupPath = 
			 wikiPage.getPageCrawler().getFullPath(setup);
			String setupPathName = PathParser.render(setupPath);
			buffer.append("!include -setup .").append(setupPathName)
			 .append("\n");
		}
	}
	buffer.append(pageData.getContent());
	if (pageData.hasAttribute("Test")) {
		WikiPage teardown =
		 PageCrawlerImpl.getInheritedPage("TearDown", wikiPage);
		if (teardown != null) {
			WikiPagePath tearDownPath =
			 wikiPage.getPageCrawler().getFullPath(teardown);
			String tearDownPathName = PathParser.render(tearDownPath);
			buffer.append("\n").append("!include -teardown .")
			 .append(tearDownPathName)
			 .append("\n");
		}
		if (includeSuiteSetup) {
			WikiPage suiteTeardown =
			 PageCrawlerImpl.getInheritedPage(
				 SuiteResponder.SUITE_TEARDOWN_NAME,
				 wikiPage
			 );
			if (suiteTeardown != null) {
				WikiPagePath pagePath =
				 suiteTeardown.getPageCrawler().getFullPath (suiteTeardown);
				String pagePathName = PathParser.render(pagePath);
				buffer.append("!include -teardown .").append(pagePathName)
				 .append("\n");
			}
		}
	}
	pageData.setContent(buffer.toString());			
	return pageData.getHtml();
}

To a very simple function:

public static String renderPageWithSetupsAndTeardowns(
   PageData pageData, boolean isSuite
) throws Exception {
	boolean isTestPage = pageData.hasAttribute("Test");
	if (isTestPage) {
		WikiPage testPage = pageData.getWikiPage();
		StringBuffer newPageContent = new StringBuffer();
		includeSetupPages(testPage, newPageContent, isSuite);
		newPageContent.append(pageData.getContent());
		includeTeardownPages(testPage, newPageContent, isSuite);
		pageData.setContent(newPageContent.toString());
	}
	return pageData.getHtml();
}

What makes this function easy to read and to understand?

Small

The first rule of functions is that they should be small. The second rule is that they should be smaller. How short ? two, three, or even four lines long

Take the very simple function above and turn it in:

public static String renderPageWithSetupsAndTeardowns(
	PageData pageData, boolean isSuite) throws Exception {
	if (isTestPage(pageData))
	includeSetupAndTeardownPages(pageData, isSuite);
	return pageData.getHtml();
}

Blocks and indenting

This implies that blocks withing if/else/while should be one line long. Probably that line should be a function call.

This also implies that functions should not be large enough to hold nested structures.

Do One Thing

Functions should do one thing, they should do it well they should do it only. The problem with that statement is that it is hard to know what ‘one thing’ is. In the LOGO language ‘TO’ is used the same way as ‘def’ in Python this has an interesting effect on way function are designed:

TO RenderPageWithSetupsAndTeardowns

One way to figure out if a function is doing more than one thing is if you can extract another function from it with a name that is not merely a restatement.

Sections within Functions

Functions that do one thing cannot reasonably be divided into sections.

One Level of Abstraction Per Function

In order to make sure our function are doing one thing we need to make sure that the statements within our function are all at the same level of abstraction. Given the example above we can see that for example:

getHtml(); - Very high abstraction
String pagePathName = PathParser.render(pagePath); - Lower abstraction
.append() - Super low.

In this cas reader may not be able to tell wheter a particular expression is an essential concept or a detail.

The Stepdown Rule - From top to bottom

We want the code to be read like a top-down narrative. every function should be followed by those at the next level of abstraction. To say this differently we want to be able to read the program as it were a set of TO paragraphs:


_To include the setups and teardowns, we include setups, then we include the test page content, and then we include the teardowns._

_To include the setups, we include the suite setup if this is a suite, then we include the regular setup._

_To include the suite setup, we search the parent hierarchy for the “SuiteSetUp” page and add an include statement with the path of that page._

Writing function that stays at a single level of abstraction is hard to do. The ‘TO’ technique makes it easier.

Switch Statements

Its hard to make small switch statement - And by nature switch statements cannot do one thing.


public Money calculatePay(Employee e)
  throws InvalidEmployeeType {
	  switch (e.type) {
		  case COMMISSIONED:
			  return calculateCommissionedPay(e);
			case HOURLY:
			  return calculateHourlyPay(e);
			case SALARIED:
			  return calculateSalariedPay(e);
			default:
			  throw new InvalidEmployeeType(e.type);
	}      
}

There are several problems with this functions:

  • Its large
  • It will grow for each new type of employee
  • It violates the Single Responsibility Principle (SRP) because there is more than one reason for it to change
  • It violates the Open Closed Principle (OCP) because it must change whenever new types are added.

The solution to this problem is to bury the switch statement in the basement of an abstract factory and never let anyone sees it. The author’s general rules for switch statements is that they can be tolerated if they appear only once, are used to create polymorphic objects and are hidden behing an inhereitance relationship so that the rest of the system can’t see it.

public abstract class Employee {
	public abstract boolean isPayday();
	public abstract Money calculatePay();
	public abstract void deliverPay(Money pay);
}    
-----------------
public interface EmployeeFactory {
	public Employee makeEmployee(EmployeeRecord r) throws InvalidEmployeeType;
}
---------------
public class EmployeeFactoryImpl implements EmployeeFactory {
	public Employee makeEmployee(EmployeeRecord r) throws InvalidEmployeeType {
		switch (r.type) {
			case COMMISSIONED:
				return new CommissionedEmployee(r) ;
			case HOURLY:
				return new HourlyEmployee(r);
			case SALARIED:
				return new SalariedEmploye(r);
			default:
				throw new InvalidEmployeeType(r.type);
		}
	}
}

Use Descriptive Names

It is hard to overestimate the value of good names. Ward’s principle “You know you are working on clean code when each routine turns out to be pretty much what you expected”. Half the battle to achieving that principle is choosing good names for small functions that do one thing.

Don’t be afraid to make long names. A long descriptive name is better than a short enigmatic name. Don’t be afraid to spend time choosing a name. Choosing descriptive names will clarify the design of the module in mind.

Function Argument

The ideal number of arguments for a function is zero (niladic). Next comes one (monadic) followed closely by two (dyadic).

Three arguments (triadic) should be avoided where possible. More than three (polyadic) requires very special justification and then shouldn’t be used anyway.

An argument could be in a different level of abstraction than the function name and forces you to know a detail that isn’t particuraly important at that point.

Arguments are even harder from a testing point of view. Imagine the difficulty of writing all the test cases to ensure that all the various combinations of arguments work properly.

Output arguments are harder to understand than input arguments. We don’t usually expect information to be going out through the return value.

Common Monadic Forms

There are two very common reasons to pass a single argument into a function:

  • You might be asking a question about that argument - as in boolean fileExists(‘MyFile)‘.
  • Or you may be operating on that argument, transforming it and returning it.

You should pick a name that make the distinction clear between the two.

A somewhat less common use of a single argument function are events. You’ve got input argument but no output. Use this form with care it should be very clear to the reader that this is an event.

Try to avoid any monadic functions that don’t follow these forms.

Flag Argument

Passing a boolean into a function is a truly terrible practice. It immediately complicates the signature of them method and loudly proclaiming that this function does more than one thing.

In the previous example we’ve seen that the function would take a boolean flag as:

render(true)

Changing this to render(boolean isSuite) helps but the best approach would be to split the function in two functions:

renderForSuite()
renderForSingleTest()

Dyadic Functions

A function with two arguments is harder to understand than a monadic function.

writeField(output-Stream, name)

This function requires a short pause until we learn to ignore the first parameter - And that is exactly the problem, we should never ignore any part of the code. The parts we ignore are where the bugs will hide.

There are times where two arguments are appropriate - for example

Point p = new Point(0,0)

This is perfectly reasonable because:

  • Cartesian points naturally take two argument
  • There is a natural order

The second point is very important, take the common use for two arguments:

def test_should_work_as_expected(expected, actual)

How many time have you written the same function (actual, expected) ? There are no natural order so we should set a convention for this

Dyads comes at a cost and you should take advantage of what mechanisms may be available to convert them into monad

Triads

Functions that takes three arguments are significantly harder to understand than dyads, the issues of ordering, pausing and ignoring are more than doubled.

assertEquals(message, expected, actual)

How many times have you read the message and though it was ‘expected’

Argument Objects

When a function seems to need more than two or three argument it is likely that those arguments should be wrapper in a class of their own -


Circle makeCircle(double x, double y, double radius)

vs

Circle makeCircle(Point center, double radius)

Reducing the number of arguments by creating is not cheating - Especially when those concept deserve a name of its own.

Argument Lists

Sometimes we want to pass a variable number of arguments into a function, for example a template for querying data could look like this:


class MyDefaultQuery:
	def default_query(user_id, *columns):
		return select( *columns).where(User.id = user)

All the same rules apply. Function that take variable arguments can be monads, dyads or even triads.

Verbs and Keywords

Choosing good names for a function can go a long way toward explaining the intent of the function and the order/intent of the arguments. In case of monad it should be a very nice verb/noun pair. For example:

def write_field(name):
	...

This is an example of the keyword form of a function name. We’ve read earlier about natural order - This can be a way to set a natural order, for example:

assertEquals(expected, actual)

becomes

assertExpectedEqualsActual(expected, actual)

Have no Side Effects

Side effects are lies. You function promise to do one thing but it also does other hidden things. Given this code:


class UserValidation:
	def check_password(username: str, password: str):
		if user:
			if is_password_okay(user, password):
				db.session.init()
				...
		return False
		...

The side effect of this function is that it calls for a db.session.init when this was never mention and is not implicit by the function’s name.

In this case we might rename the function to something like

def check_password_and_initialize_db(...)

Output Arguments

Arguments are most naturally interpreted as inputs to a function. But sometimes it can be an output - For example a function like

appendFooter(s)

Does this function append s ? or footer to s ? To make sure what this function does you’ll have to check its signature, its a cognitive break and should be avoided.

This need of output arguments almost disappeared in the days of OO because this. is intended to act as an output argument. In other word invoke it as:

s.appendFooter();

Command Query Separation

Functions should either do something or answer something, but not both. Consider this function:

public boolean set(String attribute, String value)

This sets the value of an attribute and return true if successful and false if attribute doesn’t exist - This lead to statements like:

if (set('username', 'unclebob'))...

From the point of view of a reader this doesn’t make much sense you don’t know for sure what it checks. We can try to resolve this by changing then name of the function to something like:

setAndCheckIfExists

But the real solution would be to split that function in two so we can do:

if (attributeExists('username')) {
	setAttribute('username', 'unclebob')
}

Prefer Exceptions to returning Error Codes

Returning error codes from command functions is a subtle violation of command query separation. It promote the use of commands in ‘if’ like such:

if (deletePage(page) == E_OK)

This lead to deeply nested structure. On the other hand by using Exception you can’t prevent those nested structure by grouping this error handling and prevents forcing the caller to treat those error directly -

try {
	deletePage(page);
	registry.deleteReference(page.name);
}
catch (Exception e) {
	logger.log(e.getMessage());
}	

The error is outside of the happy path and can be simplified

Extract try/catch blocks Try/catches are ugly on their own, they confuse the structure of the code and mix error processing with normal processing so its better to extract the bodies of the try catch blocks out of the function:

public void delete(Page page) {  
     try {  
       deletePageAndAllReferences(page);  
     }  
     catch (Exception e) {  
       logError(e);  
     }  
   }  
  
   private void deletePageAndAllReferences(Page page) throws Exception {  
     deletePage(page);  
     registry.deleteReference(page.name);  
     configKeys.deleteKey(page.name.makeKey());  
   }  
  
   private void logError(Exception e) {  
     logger.log(e.getMessage());  
   }

In the above the delete function is all about error processing - deletePageAndAllReference is all about fully deleting a page and logError is all about the error management.

Error Handling Is One Thing Just like any function, error handling is one thing and thus error handling function should just do one thing. If the try/catch block exist in a function it should be the first word and their should be nothing after the catch.

Don’t Repeat Yourself The DRY Principle - While sometime not easy to post because code can be dupplicated throughout the codebase and in different instances. The duplication is a problem because it bloats the code and will require multiple modification should said duplicated code needs to be changed.

Duplication may be the root of all evil in software. Many principles and practices have been created for the purpose of eliminating it. Consider how object-oriented programming serves to concentrate code into base class that would otherwise be redundant.

Structured Programming Edsger Dijkstra’s rules of structured programming - He said that every function, every block should have one entry and one exit - Following these rules means that there should only be one return statement in a function and no break or continue statement in a loop and never ‘goto’ statements. While the rule itself is great it only really has benefits in larger fucntions, if you keep them small than the occasional mutiple return do not harm anything.

How Do You Write Functions Like This? Writing software is like any other kind of writing. First get your ghoughts down, then massage it until it reads well. First draft might be clumsy and disorganized. When writing a function make sure to have test that cover those clumsy line of codes - Then massage it, refine it, split out functions, change names, eliminate duplication and shrink methods - All while the tests are passing.

Conclusion Every system is built from a domain-specific language designed by the programmer to describe that system. The art of programming is, and always has been the art of language design. Master programmers think of systems as stories to be told rathen than programs to be written. This chapter has been all about the mechanics of writing functions well. Following them will get you short, well named, nicely organized functions.