Sunday, March 13, 2016

What is Code Quality?

There are many different definitions of code quality.  Some of the more recent definitions attempt to avoid any commitment to ideas such as loose-coupling, or adherence to known patterns in favor of a roll-your-own approach (you'd be surprised, or maybe not, how many software shops adhere to the not-invented-here syndrome).  These definitions only discuss code quality in how it relates to fulfilling the functional requirements.  Code quality is more than that, however.  It's about S.O.L.I.D. design principles, lowering static code analysis rules, and as importantly, communication with other developers about precisely what a method is or does.  I would also include unit tests as part of my overall definition.  On a weighted scale, I would say it's about 30% S.O.L.I.D., 30% static code analysis rules, and 30% established pattern reuse, and 10% communication with other developers.  I'll comment on each of these in turn.

Of the S.O.L.I.D. principles, I weight them differently as well.  Single-responsibility is by far the most important, followed by dependency inversion, and then interface segregation.  Open/closed principle and Liskov substitution I care less strongly about (unless I'm writing a framework, then I care quite a bit).  The reason single-responsibility is the first on my list, is because by following this principle, unit testable code immediately arises.  If I have a data-access party type of middle tier, for instance [link to Emergent Patterns], then my business objects aren't just my business objects, they're also my data access objects, and who knows what else.  Testing this type of scenario means I have to write expensive integration tests (in terms of developer time to find bugs).  If I have a clean division between the business and data access layers, then I can easily test my code with unit tests, and mock out everything else.  This is awesome, and I basically get it for free by following single-responsibility.  My other two favorites support test-ability as well.  Dependency inversion makes it way easier to do unit testing, especially if I segregate my interfaces to support specific clients (in this case, I mean specific clients to be classes that use implementations of those interfaces).

Why do I not favor Liskov and open/closed as much?  It's simple - they don't support testing as much.  Liskov deals strictly with object inheritance - which has more to do with swapping out code with other code which has already been tested (hopefully) at run time.  Open/closed is also about object inheritance, and suggests that objects should basically allow inheritance.  These two principles seem to contradict such practices as sealed classes (because they are closed for extension) and some simple things like adding logging to a subclass may be considered to violate Liskov (if you blur your eyes, and tilt your head).  In a word, they're 'squishy'.

With regard to static code analysis, the list of rules I favor is too long to mention here.  Suffice it to say that I believe experienced (read: senior+) software engineers should be able to agree on a list of rules which a particular software shop adheres to.  This is not the same as suggesting that rules should rise organically out of what makes it into functional code.  I suggest that healthy deliberation should ensue, and from this, a group of rules should arise which are then baked into the code base by modeling those rules with static code analysis tools (like Resharper or Visual Studio).  I acknowledge that most development shops will not go through the exercise of determining a set of rules which makes sense for their own application of software engineering.  That doesn't make it any less important for teams at least to come to some sort of coding standard which makes sense in their problem domain.

Patterns (such as exist in the Gang of Four) exist for a reason - use them.  Don't invent a custom pattern if one exists which already fulfills the need.  This is not the same thing as suggesting that every pattern you see in existing software is a good one - it must fulfill the need.  Ideally, the pattern originates from GoF or in a Security Patterns or Enterprise Patterns text, because those patterns have been distilled from an eternally growing list of patterns, and have been proven to work time and time again.  Emergent patterns originating from your own software organization are not as fortunate.  That said, if a pattern doesn't suit your needs, don't hack it up so that it does.  Some software engineering tasks don't readily lend themselves to patterns.  But...if you don't take the time to look, you won't know.  I keep a copy of GoF and Security Patterns by my desk for ease of reference.

Finally, communication with other developers is key in software engineering organizations, now many of which have hundreds if not thousands of engineers. By communication, I mean that I should not have to drill into a method to know what it does - it should be implicit in the name, and my assumption about what the name indicates should be generally correct.  I should not have to find all references of a variable to figure out what it is for (or depend on an IDE to do it for me  - it should be obvious and concise from the name).  I should not have to worry about side effects which are not called out readily in method comments...the list goes on and on.  A lot of this could be mitigated by the above paragraphs, but if not, then think of writing code as having a three-way conversation, one with the computer, and one with another developer.  Both of them should easily know what your code is meant to do without having to ask you or run it.  Robert C. Martin has a lot to say on this topic in his book Clean Code - which sits on my personal shelf as a reference.

So when I talk about poor code quality, I'm talking about code which does not consider the above very well.  When I talk about good code quality, I'm talking about code which is disciplined about this kind of 'stuff'.  I believe this is general enough to extend to casual conversations about code quality (i.e. that lot's of folks think along these lines when the squishy topic of 'code quality' arises), and I use it this way often in my blog posts.