Why was this written?
Reasoning about code means being able to follow the execution path (“running the program in your head”) while knowing what the goal of the code is.
- The presence of “voodoo code”, or code that has no effect on the goal of the program but is diligently maintained anyway (such as initializing variables that are never used, calling functions that are irrelevant to the goal, producing output that is not used, etc.)
- Executing idempotent functions multiple times (eg: calling the save() function multiple times “just to be sure”)
- Fixing bugs by writing code that overwrites the result of the faulty code
- “Yo-Yo code” that converts a value into a different representation, then converts it back to where it started (eg: converting a decimal into a string and then back into a decimal, or padding a string and then trimming it)
- “Bulldozer code” that gives the appearance of refactoring by breaking out chunks into subroutines, but ones that are impossible to reuse in another context (they remain tightly coupled to their one call site)
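The “Yo-Yo” pattern above is easiest to see in a concrete sketch. This hypothetical Python example (the function names are invented for illustration) pads a string and then immediately trims it, so the round-trip has no net effect:

```python
def normalize_id(raw: str) -> str:
    # Yo-yo: pad to a fixed width, then immediately strip that padding
    padded = raw.zfill(10)        # "00742" -> "0000000742"
    return padded.lstrip("0")     # -> "742"; the zfill had no net effect

def normalize_id_direct(raw: str) -> str:
    # Same result in one step, with no round-trip
    return raw.lstrip("0")

print(normalize_id("00742"))         # 742
print(normalize_id_direct("00742"))  # 742
```

A reader who can run the code in their head spots immediately that the padding step is dead weight; a reader who can’t will diligently maintain it.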
To get over this deficiency a programmer can practice by using the IDE’s own debugger as an aid, if it has the ability to step through the code one line at a time. In Visual Studio, for example, this means setting a breakpoint at the beginning of the problem area and stepping through with the ‘F11’ key, inspecting the values of variables–before and after they change–until you understand what the code is doing. If the target environment doesn’t have such a feature, then do your practice-work in one that does.
The goal is to reach a point where you no longer need the debugger to follow the flow of code in your head, and where you are patient enough to think about what the code is doing to the state of the program. The reward is the ability to identify redundant and unnecessary code, and to find bugs in existing code without having to re-implement the whole routine from scratch.
Object Oriented Programming is an example of a programming model, as is Functional or Declarative programming. They’re each significantly different from procedural or imperative programming, just as procedural programming is significantly different from assembly or GOTO-based programming. Then there are languages which follow a major programming model (such as OOP) but introduce their own improvements, such as list comprehensions, generics, duck-typing, etc.
- Using whatever syntax is necessary to break out of the model, then writing the remainder of the program in their familiar language’s style
- (OOP) Attempting to call non-static methods or access instance variables on a class that was never instantiated, and having difficulty understanding why it won’t compile
- (OOP) Writing lots of “xxxxxManager” classes that contain all of the methods for manipulating the fields of objects that have few or no methods of their own
- (Relational) Treating a relational database as an object store and performing all joins and relation enforcement in client code
- (Functional) Creating multiple versions of the same algorithm to handle different types or operators, rather than passing functions as parameters to a single generic implementation
- (Functional) Manually caching the results of a deterministic function on platforms that do it automatically (such as SQL and Haskell)
- (Functional) Using cut-and-paste code from someone else’s program to deal with I/O and Monads
- (Declarative) Setting individual values in imperative code rather than using data-binding
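The functional symptom about duplicated algorithms can be sketched briefly. This hypothetical Python example (function names invented for illustration) shows two hand-rolled copies of the same loop, then the single generic fold that replaces both by taking the operator as a parameter:

```python
from functools import reduce
import operator

# Anti-pattern: one copy of the algorithm per operator
def sum_list(xs):
    total = 0
    for x in xs:
        total += x
    return total

def product_list(xs):
    total = 1
    for x in xs:
        total *= x
    return total

# Idiomatic: one generic fold, with the operator passed in as a function
def fold(xs, op, start):
    return reduce(op, xs, start)

print(fold([1, 2, 3, 4], operator.add, 0))  # 10
print(fold([1, 2, 3, 4], operator.mul, 1))  # 24
```

The generic version is shorter than either duplicate and extends to any new operator without another copy of the loop.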
If your skills deficiency is a product of ineffective teaching or studying, then an alternative teacher is the compiler itself. There is no more effective way of learning a new programming model than starting a new project and committing yourself to use whatever the new constructs are, intelligently or not. You also need to practice explaining the model’s features in crude terms of whatever you are familiar with, then recursively building on your new vocabulary until you understand the subtleties as well. For example:
Phase 1: “OOP is just records with methods”
Phase 2: “OOP methods are just functions running in a mini-program with its own global variables”
Phase 3: “The global variables are called fields, some of which are private and invisible from outside the mini-program”
Phase 4: “The idea of having private and public elements is to hide implementation details and expose a clean interface, and this is called Encapsulation”
Phase 5: “Encapsulation means my business logic doesn’t need to be polluted with implementation details”
Phase 5 looks the same for all languages, since they are all really trying to get the programmer to the point where he can express the intent of the program without burying it in the specifics of how. Take functional programming as another example:
Phase 1: “Functional programming is just doing everything by chaining deterministic functions together”
Phase 2: “When the functions are deterministic the compiler can predict when it can cache results or skip evaluation, and even when it’s safe to prematurely stop evaluation”
Phase 3: “In order to support Lazy and Partial Evaluation, the compiler requires that functions are defined in terms of how to transform a single parameter, sometimes into another function. This is called Currying”
Phase 4: “Sometimes the compiler can do the Currying for me”
Phase 5: “By letting the compiler figure out the mundane details, I can write programs by describing what I want, rather than how to give it to me”
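Phases 3 and 4 can be made concrete even in a language that doesn’t curry automatically. As a rough sketch (not a claim about how Haskell compilers work), Python’s `functools.partial` approximates currying by fixing one parameter of a two-parameter function and returning a new one-parameter function:

```python
from functools import partial

def power(base, exponent):
    return base ** exponent

# Fix one parameter, get back a function of the remaining parameter
square = partial(power, exponent=2)
cube = partial(power, exponent=3)

print(square(5))  # 25
print(cube(2))    # 8
```

In a fully curried language every function already behaves like `power` after `partial` has been applied: it consumes one argument and returns either a result or another function.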
Modern languages and frameworks now come with an awesome breadth and depth of built-in commands and features, with some leading frameworks (Java, .Net, Cocoa) being too large to expect any programmer, even a good one, to learn in anything less than a few years. But a good programmer will search for a built-in function that does what they need before they begin to roll their own, and excellent programmers have the skill to break-down and identify the abstract problems in their task, then search for existing frameworks, patterns, models and languages that can be adapted before they even begin to design the program.
These are only indicative of the problem if they continue to appear in the programmer’s work long after he should have mastered the new platform.
- Re-inventing or laboring without basic mechanisms that are built-into the language, such as events-and-handlers or regular expressions
- Re-inventing classes and functions that are built-into the framework (eg: timers, collections, sorting and searching algorithms) *
- “Email me teh code, plz” messages posted to help forums
- “Roundabout code” that accomplishes in many instructions what could be done with far fewer (eg: rounding a number by converting a decimal into a formatted string, then converting the string back into a decimal)
- Persistently using old-fashioned techniques even when new techniques are better in those situations (eg: still writing named delegate functions instead of using lambda expressions)
- Having a stark “comfort zone”, and going to extreme lengths to solve complex problems with primitives
* – Accidental duplication will also happen, proportionate to the size of the framework, so judge by degree. Someone who hand-rolls a linked list might Know What They Are Doing, but someone who hand-rolls their own StrCpy() probably does not.
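The “roundabout code” symptom is worth one concrete sketch. This hypothetical Python example (names invented for illustration) rounds a number by detouring through a formatted string, next to the single built-in call that does the same job:

```python
def roundabout_round(x: float) -> float:
    # Roundabout: format to a string, then parse the string back
    return float(f"{x:.2f}")

def direct_round(x: float) -> float:
    # The platform already provides this in one call
    return round(x, 2)

print(roundabout_round(3.14159))  # 3.14
print(direct_round(3.14159))      # 3.14
```

(The two approaches can differ on exact .5 ties, since `round` uses banker’s rounding; for typical inputs they agree. The point is that the string detour is three operations doing the work of one.)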
A programmer can’t acquire this kind of knowledge without slowing down, and it’s likely that he’s been in a rush to get each function working by whatever means necessary. He needs to have the platform’s technical reference handy and be able to look through it with minimal effort, which can mean either having a hard copy of it on the desk right next to the keyboard, or having a second monitor dedicated to a browser. To get into the habit initially, he should refactor his old code with the goal of reducing its instruction count by 10:1 or more.
If you don’t understand pointers then there is a low ceiling on the types of programs you can write, as the concept of pointers enables the creation of complex data structures and efficient APIs. Managed languages use references instead of pointers, which are similar but add automatic dereferencing and prohibit pointer arithmetic to eliminate certain classes of bugs. They are still similar enough, however, that a failure to grasp the concept will be reflected in poor data-structure design and in bugs that trace back to the difference between pass-by-value and pass-by-reference in method calls.
- Inability to implement a linked list, or to write code that inserts/deletes nodes from a linked list or tree without losing data
- Allocating arbitrarily big arrays for variable-length collections and maintaining a separate collection-size counter, rather than using a dynamic data structure
- Inability to find or fix bugs caused by mistakenly performing arithmetic on pointers
- Modifying the dereferenced values from pointers passed as the parameters to a function, and not expecting it to change the values in the scope outside the function
- Making a copy of a pointer, changing the dereferenced value via the copy, then assuming the original pointer still points to the old value
- Serializing a pointer to the disk or network when it should have been the dereferenced value
- Sorting an array of pointers by performing the comparison on the pointers themselves
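Several of the symptoms above come down to aliasing: two names pointing at the same underlying object. Managed references exhibit the same behavior, so a rough Python sketch (names invented for illustration) can show it without raw pointers:

```python
def scale_in_place(values: list, factor: int) -> None:
    # 'values' is a reference to the caller's list, not a copy of it,
    # so mutating through it changes the caller's data too
    for i in range(len(values)):
        values[i] *= factor

data = [1, 2, 3]
alias = data              # copies the reference, not the list
scale_in_place(alias, 10)
print(data)               # [10, 20, 30] -- the "original" changed as well
```

A programmer who expects `data` to still be `[1, 2, 3]` here is exhibiting exactly the pass-by-value/pass-by-reference confusion described above.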
“A friend of mine named Joe was staying somewhere else in the hotel and I didn’t know his room number. But I did know which room his acquaintance, Frank, was staying in. So I went up there and knocked on his door and asked him, ‘Where’s Joe staying?’ Frank didn’t know, but he did know which room Joe’s co-worker, Theodore, was staying in, and gave me that room number instead. So I went to Theodore’s room and asked him where Joe was staying, and Theodore told me that Joe was in Room 414. And that, in fact, is where Joe was.”
Pointers can be described with many different metaphors, and data structures with many analogies. The above is a simple analogy for a linked list, and anybody can invent their own, even people who aren’t programmers. The comprehension failure doesn’t occur when pointers are described, so you can’t fix it by describing them any more thoroughly than they already have been. It occurs when the programmer tries to visualize what’s going on in the computer’s memory and conflates pointers with their understanding of regular variables, which are very similar. It may help to translate the code into a simple story, to help reason about what’s going on, until the distinction clicks and the programmer can visualize pointers and the data structures they enable as intuitively as scalar values and arrays.
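The hotel story maps directly onto a linked list. In this Python sketch, each room holds a reference (“pointer”) to the next room to ask; Room 414 is from the story, while the other room numbers are invented for illustration:

```python
class Room:
    """A node in the chain: an occupant plus a pointer to the next room."""
    def __init__(self, number, occupant, next_room=None):
        self.number = number
        self.occupant = occupant
        self.next_room = next_room   # the "pointer" from the story

# Build the chain: Frank -> Theodore -> Joe
joe = Room(414, "Joe")
theodore = Room(230, "Theodore", next_room=joe)
frank = Room(112, "Frank", next_room=theodore)

# Follow the pointers until the end of the chain, like knocking on doors
room = frank
while room.next_room is not None:
    room = room.next_room
print(f"{room.occupant} is in room {room.number}")  # Joe is in room 414
```

Note that `frank` and `room` are references to `Room` objects, not the rooms themselves; the traversal loop copies and re-points a reference, which is exactly the operation the analogy is teaching.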
The idea of recursion is easy enough to understand, but programmers often have problems imagining the result of a recursive operation in their minds, or how a complex result can be computed with a simple function. This makes recursive functions harder to design, because it’s hard to picture “where you are” when you come to write the test for the base condition or the parameters for the recursive call.
- Hideously complex iterative algorithms for problems that can be solved recursively (eg: traversing a filesystem tree), especially where memory and performance are not at a premium
- Recursive functions that check the same base condition both before and after the recursive call
- Recursive functions that don’t test for a base condition
- Recursive subroutines that concatenate/sum to a global variable or a carry-along output variable
- Apparent confusion about what to pass as the parameter in the recursive call, or recursive calls that pass the parameter unmodified
- Thinking that the number of iterations is going to be passed as a parameter
Get your feet wet and be prepared for some stack overflows. Begin by writing code with only one base-condition check and one recursive call that uses the same, unmodified parameter that was passed. Stop coding even if you have the feeling that it’s not enough, and run it anyway. It will throw a stack-overflow exception, so now go back and pass a modified copy of the parameter in the recursive call. Still getting stack overflows? Excessive output? Then do more code-and-run iterations, switching between tweaking your base-condition test and tweaking your recursive call, until you start to intuit how the function is transforming its input. Resist the urge to use more than one base-condition test or recursive call unless you really Know What You’re Doing.
Your goal is to have the confidence to jump in, even if you don’t have a complete sense of “where you are” in the imaginary recursive path. Then, when you need to write a function for a real project, begin by writing a unit test first and proceed with the same technique above.
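The end state of that practice looks something like this hypothetical Python sketch: one base-condition test, one recursive step, and a complex result (the total over an arbitrarily nested tree) falling out of a very simple function:

```python
def deep_sum(node):
    # Base condition: a leaf carries its own value
    if isinstance(node, int):
        return node
    # Recursive call: each child is a smaller version of the same problem
    return sum(deep_sum(child) for child in node)

print(deep_sum([1, [2, [3, 4]], 5]))  # 15
```

Notice that the parameter passed to the recursive call is always a modified (smaller) piece of the input; passing `node` itself unchanged would recurse forever, which is the stack-overflow stage of the practice loop described above.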
- Writing IsNull() and IsNotNull(), or IsTrue(bool) and IsFalse(bool) functions
- Checking to see if a boolean-typed variable is something other than true or false
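Both symptoms can be shown in a few lines of Python (the function names are invented for illustration): a redundant wrapper that re-derives a boolean the language already provides, next to the idiomatic form where the expression itself is the answer:

```python
# Anti-pattern mirroring the symptoms above
def is_not_null(value):
    if value != None:   # redundant wrapper around a built-in comparison
        return True
    else:
        return False

# Idiomatic equivalent: the comparison already *is* a boolean
def has_value(value):
    return value is not None

print(is_not_null("x"))  # True
print(has_value("x"))    # True
```

The wrapper adds a name, a branch, and two return statements without adding any information; a comparison expression can be returned, stored, or tested directly.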
Source: Software Engineering Tips