Why was this written?
Most of these faults were discovered the hard way by the author, either because he committed them himself or because he saw them in the work of others.
This paper is not meant for grading programmers. It is intended to be read by programmers who trust their ability to judge when something is a sign of bad practice and when it’s a consequence of special circumstances.
This paper was written to force its author to think, and published because he thinks you lot would probably get a kick out of it, too.
1. Inability to think in sets
Transitioning from imperative programming to functional and declarative programming will immediately require you to think about operating on sets of data as your primitive, not scalar values. The transition is required whenever you use SQL with a relational database (and not as an object store), whenever you design programs that will scale linearly with multiple processors, and whenever you write code that has to execute on a SIMD-capable chip (such as modern graphics cards and video game consoles).
The following count only when they’re seen on a platform with Declarative or Functional programming features that the programmer should be aware of.
- Performing atomic operations on the elements of a collection within a for or foreach loop
- Writing Map or Reduce functions that contain their own loop for iterating through the dataset
- Fetching large datasets from the server and computing sums on the client, instead of using aggregate functions in the query
- Functions acting on elements in a collection that begin by performing a new database query to fetch a related record
- Writing business-logic functions with tragically compromising side-effects, such as updating a user interface or performing file I/O
- Entity classes that open their own database connections or file handles and keep them open for the lifespan of each object
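As a minimal sketch of the third symptom above and its set-based alternative, the following uses Python's built-in sqlite3 module. The `orders` table, its columns, and both function names are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("ann", 10), ("bob", 5), ("ann", 7)],
)


def totals_on_client(conn):
    """The symptom: fetch every row, then sum in a loop on the client."""
    totals = {}
    for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
        totals[customer] = totals.get(customer, 0) + amount
    return totals


def totals_in_query(conn):
    """The set-based version: the database aggregates and returns one row per customer."""
    rows = conn.execute("SELECT customer, SUM(amount) FROM orders GROUP BY customer")
    return dict(rows)
```

Both functions return the same totals, but the second transfers one row per customer instead of one row per order, and leaves the summing to the engine that is built for it.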
Funnily enough, visualizing a card dealer cutting a deck of cards and interleaving the two stacks together by flipping through them with his thumbs can jolt the mind into thinking about sets and how you can operate on them in bulk. Other stimulating visualizations are:
- freeway traffic passing through an array of toll booths (parallel processing)
- springs joining to form streams joining to form creeks joining to form rivers (parallel reduce/aggregate functions)
- a newspaper printing press (coroutines, pipelines)
- the zipper tag on a jacket pulling the zipper teeth together (simple joins)
- transfer RNA picking up amino acids and joining messenger RNA within a ribosome to become a protein (multi-stage function-driven joins, see animation)
- the above happening simultaneously in billions of cells in an orange tree to convert air, water and sunlight into orange juice (Map/Reduce on large distributed clusters)
If you are writing a program that works with collections, think about all the supplemental data and records that your functions need to work on each element and use Map functions to join them together in pairs before you have your Reduce function applied to each pair.
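One way to picture the advice above is the sketch below, which assumes a hypothetical list of order lines and a price list. The Map step pairs each element with the supplemental record it needs, and the Reduce step then folds the pairs into a single result.

```python
from functools import reduce

# Hypothetical data: order lines and the price list they depend on.
order_lines = [("apple", 3), ("pear", 2), ("apple", 1)]
prices = {"apple": 50, "pear": 80}  # price in cents


def attach_price(line):
    """Map step: pair each order line's quantity with the price it needs."""
    product, quantity = line
    return (quantity, prices[product])


def add_line_total(running_total, pair):
    """Reduce step: fold one (quantity, price) pair into the running total."""
    quantity, price = pair
    return running_total + quantity * price


pairs = map(attach_price, order_lines)
order_total = reduce(add_line_total, pairs, 0)
```

Because `attach_price` has no side effects and looks at one element at a time, the Map step could be spread over several workers without changing the result.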
2. Lack of critical thinking
Unless you criticize your own ideas and look for flaws in your own thinking, you will miss problems that can be fixed before you even start coding. If you also fail to criticize your own code once written, you will only learn at the vastly slower pace of trial and error. This problem originates in both lazy thinking and egocentric thinking, so its symptoms seem to come from two different directions.
- Homebrew “Business Rule Engines”
- Fat static utility classes, or multi-disciplinary libraries with only one namespace
- Conglomerate applications, or attaching unrelated features to an existing application to avoid the overhead of starting a new project
- Architectures that have begun to require epicycles
- Adding columns to tables for tangential data (eg: putting a “# cars owned” column on your address-book table)
- Inconsistent naming conventions
- “Man with a hammer” mentality, or changing the definitions of problems so they can all be solved with one particular technology
- Programs that dwarf the complexity of the problem they solve
- Pathologically and redundantly defensive programming (“Enterprisey code”)
- Re-inventing LISP in XML
Start with a book like Critical Thinking by Paul and Elder, work on controlling your ego, and practice resisting the urge to defend yourself as you submit your ideas to friends and colleagues for criticism.
Once you get used to other people examining your ideas, start examining your own ideas yourself and practice imagining the consequences of them. You also need to develop a sense of proportion (to have a feel for how much design is appropriate for the size of the problem), a habit of fact-checking assumptions (so you don’t overestimate the size of the problem), and a healthy attitude towards failure (even Isaac Newton was wrong about gravity, but we still love him and needed him to try anyway).
Finally, you must have discipline. Being aware of flaws in your plan will not make you more productive unless you can muster the willpower to correct and rebuild what you’re working on.
3. Pinball Programming
When you tilt the board just right, pull back the pin to just the right distance, and hit the flipper buttons in the right sequence, then the program runs flawlessly with the flow of execution bouncing off conditionals and careening unchecked toward the next state transition.
- One Try-Catch block wrapping the entire body of Main() and resetting the program in the Catch clause (the pinball gutter)
- Using strings/integers for values that have (or could be given) more appropriate wrapper types in a strongly-typed language
- Packing complex data into delimited strings and parsing it out in every function that uses it
- Failing to use assertions or method contracts on functions that take ambiguous input
- The use of Sleep() to wait for another thread to finish its task
- Switch statements on non-enumerated values that don’t have an “Otherwise” clause
- Using Automethods or Reflection to invoke methods that are named in unqualified user input
- Setting global variables in functions as a way to return multiple values
- Classes with one method and a couple of fields, where you have to set the fields as the way of passing parameters to the method
- Multi-row database updates without a transaction
- Hail-Mary passes (eg: trying to restore the state of a database without a transaction and ROLLBACK)
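As a hedged illustration of the last two symptoms, the sketch below wraps a multi-row update in a transaction using Python's sqlite3 module. The `accounts` table, its CHECK constraint, and the `transfer` and `balances` helpers are assumptions made up for this example.

```python
import sqlite3

# Hypothetical schema: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("ann", 100), ("bob", 20)])
conn.commit()


def transfer(conn, source, target, amount):
    """Update two rows so that both changes commit or neither does."""
    # Used as a context manager, a sqlite3 connection commits when the block
    # succeeds and rolls back when the block raises an exception.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, target)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, source)
        )


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

If the second UPDATE violates the constraint, the credit already applied to the target row is rolled back with it, so no code has to reconstruct the earlier state by hand.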
Imagine your program’s input is water. It’s going to fall through every crack and fill every pocket, so you need to think about what the consequences are when it flows somewhere other than where you’ve explicitly built something to catch it.
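A small sketch of building something to catch the water: the code below converts raw text into an enumerated type at the program's boundary and gives the branching code an explicit "otherwise" clause. `OrderStatus`, `parse_status` and `next_action` are hypothetical names chosen for the example.

```python
from enum import Enum


class OrderStatus(Enum):
    PENDING = "pending"
    SHIPPED = "shipped"
    CANCELLED = "cancelled"


def parse_status(raw):
    """Convert untrusted text to an OrderStatus; raises ValueError if it is unknown."""
    return OrderStatus(raw.strip().lower())


def next_action(status):
    """Pick the next step for an order, refusing anything unexpected."""
    if status is OrderStatus.PENDING:
        return "pack"
    elif status is OrderStatus.SHIPPED:
        return "track"
    elif status is OrderStatus.CANCELLED:
        return "refund"
    else:
        # The "otherwise" clause: stop here instead of careening onward.
        raise ValueError(f"unhandled status: {status!r}")
```

Input that matches no member is rejected at `parse_status`, and a plain string that slips past it is rejected by the final branch rather than silently doing nothing.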
You will need to make yourself familiar with the mechanisms on your platform that help make programs robust and ductile. There are three basic kinds:
- those which stop the program before any damage is done when something unexpected happens, then help you identify what went wrong (type systems, assertions, exceptions, etc.),
- those which direct program flow to whatever code best handles the contingency (try-catch blocks, multiple dispatch, event driven programming, etc.),
- those which pause the thread until all your ducks are in a row (WaitUntil commands, mutexes and semaphores, SyncLocks, etc.)
There is also a fourth, Unit Testing, which you use at design time.
Using these ought to become second nature to you, like putting commas and periods in sentences. To get there, go through the above mechanisms (the ones in parentheses) one at a time and refactor an old program to use them wherever you can cram them, even if it doesn’t turn out to be appropriate (especially when they don’t seem appropriate, so you also begin to understand why).
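As one example of the third kind of mechanism, this sketch waits for a worker thread with an event rather than a guessed Sleep() interval. It uses Python's threading module. The worker task, the function names, and the five-second timeout are assumptions for illustration.

```python
import threading


def run_worker(results, done):
    """Hypothetical worker: compute a value, then signal that it is ready."""
    results["total"] = sum(range(1000))
    done.set()


def fetch_total(timeout_seconds=5.0):
    """Start the worker and block until it signals completion."""
    results = {}
    done = threading.Event()
    worker = threading.Thread(target=run_worker, args=(results, done))
    worker.start()
    # wait() returns as soon as the event is set, or False after the timeout,
    # so the caller never reads a result that has not been written yet.
    if not done.wait(timeout=timeout_seconds):
        raise TimeoutError("worker did not finish in time")
    worker.join()
    return results["total"]
```

The waiting thread resumes as soon as the work is done, and a worker that takes too long produces an explicit error instead of a missing value.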
4. Unfamiliar with the principles of security
If the following symptoms weren’t so dangerous they’d be little more than an issue of fit-n-finish for most programs, meaning they don’t make you a bad programmer, just a programmer who shouldn’t work on network programs or secure systems until he’s done a bit of homework.
- Storing exploitable information (names, card numbers, passwords, etc.) in plaintext
- Storing exploitable information with ineffective encryption (symmetric ciphers with the password compiled into the program; trivial passwords; any “decoder-ring”, homebrew, proprietary or unproven ciphers)
- Programs or installations that don’t limit their privileges before accepting network connections or interpreting input from untrusted sources
- Not performing bounds checking or input validation, especially when using unmanaged languages
- Constructing SQL queries by string concatenation with unvalidated or unescaped input
- Invoking programs named by user input
- Code that tries to prevent an exploit from working by searching for the exploit’s signature
- Credit card numbers or passwords that are stored in an unsalted hash
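For the last symptom, here is a minimal sketch of salted password hashing using only Python's standard library. It covers passwords only. The iteration count is an assumed work factor, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Because every password gets its own salt, two users with the same password end up with different stored digests, which defeats precomputed lookup tables.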
The following only covers basic principles, but they’ll avoid most of the egregious errors that can compromise an entire system. For any system that handles or stores information of value to you or its users, or that controls a valuable resource, always have a security professional review the design and implementation.
Begin by auditing your programs for code that stores input in an array or other kind of allocated memory and make sure it checks that the size of the input doesn’t exceed the memory allocated for storing it. No other class of bug has caused more exploitable security holes than the buffer overflow, and to such an extent that you should seriously consider a memory-managed language when writing network programs, or anywhere security is a priority.
Next, audit for database queries that concatenate unmodified input into the body of a SQL query and switch to using parameterized queries if the platform supports it, or filter/escape all input if not. This is to prevent SQL-injection attacks.
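The difference can be sketched with Python's sqlite3 module, which supports `?` placeholders. The `users` table and both function names are hypothetical.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("ann", 1), ("bob", 0)])


def find_user_unsafe(conn, name):
    """The symptom: input is concatenated into the text of the query."""
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def find_user(conn, name):
    """The fix: the placeholder passes input as data, never as SQL."""
    rows = conn.execute("SELECT name FROM users WHERE name = ?", (name,))
    return [row[0] for row in rows]


malicious = "x' OR '1'='1"
```

Given `malicious`, the concatenating version returns every user in the table, while the parameterized version looks for a user literally named `x' OR '1'='1` and finds none.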
After you’ve de-fanged the two most infamous classes of security bug you should continue thinking about all program input as completely untrustworthy and potentially malicious. It’s important to define your program’s acceptable input in the form of working validation code, and your program should reject input unless it passes validation so that you can fix exploitable holes by fixing the validation and making it more specific, rather than scanning for the signatures of known exploits.
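A sketch of validation that defines acceptable input instead of hunting for known attacks is shown below. The username rule (3 to 20 characters, lowercase letters, digits and underscores, starting with a letter) is an assumption invented for the example.

```python
import re

# Hypothetical rule: a lowercase letter followed by 2 to 19 letters, digits or underscores.
USERNAME_PATTERN = re.compile(r"[a-z][a-z0-9_]{2,19}")


def validate_username(raw):
    """Return raw unchanged if it is an acceptable username, else raise ValueError."""
    # fullmatch() requires the whole string to fit the pattern, so stray
    # prefixes, suffixes and trailing newlines are all rejected.
    if not isinstance(raw, str) or USERNAME_PATTERN.fullmatch(raw) is None:
        raise ValueError("invalid username")
    return raw
```

If a hole turns up later, the fix is to tighten the pattern, not to add another attack signature to a blacklist.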
Going further, you should always think about what operations your program needs to perform and the privileges it’ll need from the host to do them before you even begin designing it, because this is the best opportunity to figure out how to write the program to use the fewest privileges possible. The principle behind this is to limit the damage that could be caused to the rest of the system if an exploitable bug was found in your code. In other words: after you’ve learned not to trust your input you should also learn not to trust your own programs.
The last thing you should learn is the basics of encryption, beginning with Kerckhoffs’s principle. It can be expressed as “the security should be in the key”, and there are a couple of interesting points to derive from it.
The first is that you should never trust a cipher or other crypto primitive unless it is published openly and has been analyzed and tested extensively by the greater security community. There is no security in obscurity, proprietary status, or novelty, as far as cryptography goes. Even implementations of trusted crypto primitives can have flaws, so avoid implementations you aren’t sure have been thoroughly reviewed (including your own). All new cryptosystems enter a pipeline of scrutiny that can be a decade long or more, and you want to limit yourself to the ones that come out of the end with all their known faults fixed.
The second is that if the key is weak, or stored improperly, then it’s as bad as having no encryption at all. If your program needs to encrypt data, but not decrypt it, or decrypt only on rare occasions, then consider giving it only the public key of an asymmetric cipher key pair and making the decryption stage run separately with the private key secured with a good passphrase that the user must enter each time.
The more that is at stake, the more homework you need to do and the more thought you must put into the design phase of the program, all because security is the one feature that dozens, sometimes millions of uninvited people will try to break after your program has been deployed.
The vast majority of security failures traceable to code have been due to silly mistakes, most of which can be avoided by screening input, using resources conservatively, using common sense, and writing code no faster than you can think and reason about it.
5. Code is a mess
- Doesn’t follow a consistent naming convention
- Doesn’t use indentation, or uses inconsistent indentation
- Doesn’t make use of whitespace elsewhere, such as between methods (or expressions, see “ANDY=NO”)
- Large chunks of code are left commented-out
Programmers in a hurry (or The Zone) commit all these crimes and come back to clean it up later, but a bad programmer is just sloppy. Sometimes it helps to use an IDE that can fix indentation and whitespace (“pretty print”) with a shortcut key, but I’ve seen programmers who can even bludgeon Visual Studio’s insistence on proper indentation by messing around with the code too much.
source : Software Engineering Tips