Code Evaluation

Think back for a moment to the presentations you've witnessed or given: a lesson in a classroom, a talk at a programming convention, a co-worker explaining a concept.

The typical exercise in code evaluation presents a problem in a neat, isolated space: one or two dozen lines of code that encapsulate an algorithm.  "Look here!" the instructor says. "There's actually a problem right on this line!"  The problem in the code is subtle, perhaps language-specific, and the fix seems ingenious once the teacher reveals it.  After looking at the example long enough, the learner is usually able to recognize that the solution is logical, and that if she were to see such a problem in the future, she would be able to apply this knowledge.  This makes the learner feel accomplished, and the exercise is complete.
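To make the genre concrete, here's a sketch of the kind of puzzle I mean. It's a made-up example of my own, not from any particular presentation: a linear search with a subtle, language-specific flaw.

```cpp
#include <cstddef>
#include <vector>

// "Look here! There's actually a problem right on this line!"
bool contains(const std::vector<int>& v, int x) {
    for (std::size_t i = 0; i < v.size() - 1; ++i) {  // <-- right here
        if (v[i] == x) return true;
    }
    return false;
}

// The reveal: the loop skips the last element, and worse, when v is
// empty, v.size() - 1 wraps around to a huge unsigned value and the
// loop reads out of bounds. The ingenious-seeming fix:
//     for (std::size_t i = 0; i < v.size(); ++i)
```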

The problem with this class of code evaluation exercise is that it is often extremely domain-specific and oversimplified.  Often, the bugs featured would have been easily found with reasonable unit tests (as the sketch below shows).  Essentially, the learner has accomplished very little by recognizing the "trick" in this minuscule codebase.  Subtle problems arising from specific algorithms, be it pointer misdirection in a complex loop or an overly complex control flow where a more elegant solution exists, are often tightly bound to the problem at hand and difficult to generalize.
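For instance, a couple of ordinary boundary tests would have flushed out the "trick" from the sketch above without any lightbulb moment at all. Plain asserts here for brevity; substitute your test framework of choice.

```cpp
#include <cassert>
#include <vector>

bool contains(const std::vector<int>& v, int x);  // from the sketch above

int main() {
    // One-element boundary: the buggy loop body never runs, so this
    // assert fires immediately.
    assert(contains({5}, 5));

    // Empty-vector boundary: the buggy version reads out of bounds;
    // the fixed version trivially returns false.
    assert(!contains({}, 3));
}
```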

Furthermore, production code is not so easily isolated to a couple dozen lines where we know something is wrong.  We can evaluate our algorithms systematically before moving on, but the reality is that we write lots of code, and there is no teacher to tell us "YOU GOT IT" or not.  Solving a bug or cleaning up an algorithm so it flows more clearly doesn't mean the code transitions to a "fixed" state.  There may very well be more issues; how carefully must we look before we can convince ourselves we have some magical, optimal measure of code on our hands?  And what happens when a new feature or requirement invades our optimal code and throws everything into a different, disorganized state?  Not many real-world algorithms live in the dusty libraries of frameworks that never need to be changed once written.  Will we still be able to apply our learnings to clean up this new problem space, with its entirely different set of potential defects?

My point is that there are several layers of disingenuousness to such exercises.  The number of subtle algorithmic improvements you can learn approaches infinity, so manufacturing these lightbulb moments is frankly futile.

What then is a good code evaluation exercise?

The first thing to recognize is that every evaluation has to be incremental.  While there are objectively correct improvements to make to a given block of code, it is a waste of time to polish an algorithm into a pristine, correct function minus a single subtle error and present that.  Real code often has several potential refactoring opportunities, may contain bugs of various classes, and might be brittle with respect to the larger feature that contains it.  These are the real problem spaces we want to get good at working in as programmers, so let's make examples about those.

I'm tired of looking at manufactured codebases cherry-picked and "cleaned up" for presentation purposes.  Take a real block of code that looks like garbage and that you wrote (yes, even you, genius programmer, write some junk), and go through how to improve it incrementally.  Acknowledge that if you have refactoring ideas, some of them may make the code more flexible to some changes but more brittle to others.  Encourage learners to see the big picture: how errors in assumptions, or just straight-up errors, can be found and solved, but also that a codebase is never optimal for everything.  Just as there are trade-offs in time and space complexity, there are trade-offs in how code is structured.
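Here's a rough sketch of the kind of incremental pass I mean, on a block I made up for illustration (the damage rules and names are invented, not from any real codebase):

```cpp
#include <algorithm>

// As first written: every rule tangled into one expression.
float apply_damage(float hp, float armor, float dmg,
                   bool shielded, bool boss_fight) {
    if (shielded) {
        if (boss_fight) hp = hp - (dmg * 0.5f - armor * 0.1f);
        else            hp = hp - (dmg * 0.25f - armor * 0.1f);
    } else {
        if (boss_fight) hp = hp - (dmg - armor * 0.1f);
        else            hp = hp - (dmg * 0.75f - armor * 0.1f);
    }
    if (hp < 0.0f) hp = 0.0f;
    return hp;
}

// One incremental pass: name the rule, change no behavior. This is
// more flexible if the multipliers change, but more brittle if the
// armor rule ever stops being shared across branches. A trade-off,
// not a victory.
float damage_multiplier(bool shielded, bool boss_fight) {
    if (shielded) return boss_fight ? 0.5f : 0.25f;
    return boss_fight ? 1.0f : 0.75f;
}

float apply_damage_v2(float hp, float armor, float dmg,
                      bool shielded, bool boss_fight) {
    float mitigated = dmg * damage_multiplier(shielded, boss_fight)
                      - armor * 0.1f;
    // Note that a latent issue survives both versions: if armor
    // exceeds the scaled damage, "mitigated" goes negative and the
    // hit heals the player. Fixing one thing doesn't mean the code
    // has reached the "fixed" state.
    return std::max(0.0f, hp - mitigated);
}
```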

And I think a very important thing to demonstrate is that, in real code, there is a trade-off between code correctness and programmer time.  Unless you are developing software for space flight or heart surgery, it is not feasible to expect production-ready code to ship with no errors (well, even that isn't strong enough: it is not feasible to expect any software to ship with no errors).

Real code evaluations of complex, large systems minimize errors but do not eliminate them all, so we should spend our time solving the more important issues rather than proving our simplest and most easily testable algorithms correct.  Unit tests succinctly validate simple procedures, but programmers will be the ones to simplify or manage complexity.

Any problem presented should illustrate how we measure the utility of code for different applications.  From there, learners will be able to see directions they can take the code and design reasonable solutions: how to maximize flexibility for certain requirements, or how to make errors as apparent as possible through architectural changes.  These generalized, instructive approaches are the ones worth sharing.

These principles are easier said than done, and by writing this I'll be pushing myself to make my day-to-day examples live up to the standards of this post.

Alex Naraghi

Game Programmer. Augmenter of code.
