Rocking the docs - about software documentation

Writing software documentation is probably one of the most unpopular tasks for software developers. It’s time consuming and draining and, worse, instead of compiling and seeing something cool happen on screen, you press save and nothing happens except your remaining disk space goes down slightly. It has to be done though – that perfectly clear, understandable, well abstracted and common sense code that you commit today will somehow have become a mass of confusing, jumbled classes and nonsensical variable names in 6 months’ time when you go back in to fix a bug, and at that point you’ll really appreciate the time you spent documenting it. Fortunately, there are a number of ways to document your code, and by choosing the right ones at the right time you can help ensure that your code stays understandable for months, or even years, to come.

User stories

Now I can read in the dark. — Photo by Nong Vang on Unsplash

User stories come from agile development and document the business requirements for the software in a clear, concise and understandable way. A common structure for them is ‘As a [type of actor] I want [feature] so that [reason]’. This is more helpful than it first appears: along with the actual feature requirement, it tells us who we’re targeting the feature at (is it someone technical, is it someone with special privileges, etc.) and what they need the feature for, which can be a huge help when working out the best way to satisfy the requirement. For example, writing a feature to delete a user because an administrator has banned them from the system is very different from writing one because the user has exercised their right to have their personal data removed from your company’s servers.

User stories continue to be useful after the software is released, letting us know which features were and weren’t implemented, and the reasoning behind them. They can also be useful when new feature requests come in, to see if (in the best case) the feature has already been implemented but perhaps not well publicised, or, more likely, a similar feature exists that can be modified. They can also help when users are complaining about some system behaviour, to find out whether it was ever part of a requirement or has slipped in by mistake.

Architecture diagrams

Orange reflective architecture — Photo by Alex wong on Unsplash

Where you are working with complex systems consisting of multiple services (or equivalent), it makes sense to have some kind of overarching documentation showing the interactions between those services. There are a number of tools that can be used to do this at varying price points and levels of complexity, but ultimately anything that lets you draw some kind of flow diagram is probably sufficient, up to and including a whiteboard or pen and paper (although if you go for these latter choices I’d recommend taking a photo and saving it somewhere common for ease of distribution).

The important thing with architecture diagrams is to keep them simple. The inner workings of each service or component aren’t relevant here – all we need to know is the name of each part and how the parts interact with each other. Even the exact data being transferred can be determined through other means.

The big problem with architecture diagrams is that they tend to get forgotten about and go out of date. There’s nothing like the sinking feeling you get after realising that the service you’ve identified on the architecture diagram as the probable cause of the problem doesn’t even exist any more! It can help to schedule regular reviews of your project’s documentation, as well as adding a task to update the documentation to any user story that requires an architectural change. What I think can really help, though, is adding comments at the entry points of the code implementing each service, with a link back to the documentation. This reminds anyone reviewing or editing the code that the documentation exists and makes it easier for them to update it if necessary.
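As a rough illustration of that last suggestion, here is a minimal Python sketch of an entry point that carries a link back to the diagram. The service names, the `handle_order` function and the wiki URL are all made up for the example, and the calls to other services are stubbed out.

```python
"""Entry point for the (hypothetical) order service."""

# ARCHITECTURE: this service appears as "order-service" on the system
# architecture diagram. The URL below is a placeholder, not a real page.
#   https://wiki.example.com/architecture/overview
# If you add or remove a call to another service here, update the diagram too.


def handle_order(order_id: str) -> dict:
    """Accept an order and report which downstream services were contacted."""
    # Calls to other services are stubbed out for this sketch.
    contacted = ["payment-service", "stock-service"]
    return {"order_id": order_id, "contacted": contacted}


if __name__ == "__main__":
    print(handle_order("A-1001"))
```

The comment sits right where a developer would add a new outgoing call, which is exactly when the diagram is most likely to need an update.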

Code comments

By code comments, I’m referring to non-code documentation that exists in your code files. There are a number of formats this can take – header blocks at the top of each file, comment blocks on code elements and inline comments in amongst the code.

Header blocks

Header blocks are often used to document the license that applies to the code file, copyright information, the author of the code and sometimes the revision history. They can also be used to give a brief summary of the purpose of the code within the file, although this is less common. While I don’t object to these details, and they may be required by your company, your source control platform or your legal jurisdiction, they provide little benefit to the developer and rarely tell us anything we can’t find out through other means. The author and revision history in particular are problematic: for any non-trivial project with more than one developer on the team, these sections tend to get very big very quickly, making it harder to find the actual code you’re interested in, and all that information is in the source control repository anyway!

Comment blocks on code elements

I may be biased, but I really like the .NET approach of writing XML-based comments on any code element (classes, methods, properties, fields, etc.). The .NET tooling then uses the details for IntelliSense in Visual Studio and can also extract the comments into other forms of automatically generated API documentation. For this reason, I would always recommend using them on public classes and methods in .NET, but why stop there? I use them for everything down to the smallest private method. I find it really useful when writing a method to try to write a one or two line description of what it’s doing in the comment block. If I can’t do that quickly then I probably need to re-think the method – either it’s doing too many things or it just doesn’t have a clear purpose. Some people find writing these comment blocks a chore, and tools like GhostDoc have been developed to fill them out automatically, but seriously, why bother? That isn’t adding any value and tricks the reader into believing they have useful documentation when they don’t! Take the time to step back and consider the purpose of your own code!

If you use a language that doesn’t have a special syntax for documenting code elements, you can probably use inline comments in a similar way to describe the purpose of your classes and methods and I would urge you to do so.
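To show the same habit outside .NET, here is a small sketch in Python, where docstrings play a broadly similar role to XML comments. The `User` class and `ban_user` function are hypothetical, invented only to illustrate the one-line summary test described above.

```python
from dataclasses import dataclass


@dataclass
class User:
    """A registered user of the system (hypothetical example type)."""

    name: str
    banned: bool = False


def ban_user(user: User) -> User:
    """Return a copy of ``user`` marked as banned, leaving the original untouched."""
    return User(name=user.name, banned=True)


if __name__ == "__main__":
    # The docstring is available at runtime, which is how help() and many
    # editors surface it to whoever is calling the function.
    print(ban_user.__doc__)
```

If the summary line had needed an ‘and’ to describe two unrelated jobs, that would have been a hint to split the function.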

Inline comments

I’ve worked in some companies where the internal coding standards forbade inline comments and I can kind of see where they’re coming from. Overuse of inline comments can actually make the code harder to read, especially where they’re documenting the obvious (set foo variable to 3, invoke ‘callapi’ method, etc.), and they can also encourage poorly written code if developers choose to use inline comments over actually writing readable code (no one wants to find a ‘this is terrible, but trust me it works’ comment written by a long-departed member of staff on a misbehaving code block!). Despite this, I think inline comments can be useful if they are used sparingly. Best practice, as with most documentation, is to focus on the why rather than the what. Discovered that a third party library requires 2 initialisation methods to be called in a specific order? Inline comment! Got to include an apparently unrelated dependency because it’s the only way to get a specific bit of data you need? Inline comment! This way, we avoid bugs introduced by well-meaning team members refactoring away the ‘mistake’, we save time later if we need to do similar things elsewhere as we might not need to discover the ‘trick’ again, and we just generally have more understandable code because the bit that makes us scratch our heads has its reasoning documented right next to it.
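The initialisation-order case might look something like the Python sketch below. `ReportClient` is a stand-in I have written for the example, not a real library; it simply fails if its two setup methods are called in the wrong order, so that the comment in `create_client` has something real to explain.

```python
class ReportClient:
    """Stand-in for a third-party client whose setup order matters (hypothetical)."""

    def __init__(self) -> None:
        self._config_loaded = False
        self._connected = False

    def load_config(self) -> None:
        self._config_loaded = True

    def connect(self) -> None:
        if not self._config_loaded:
            raise RuntimeError("connect() called before load_config()")
        self._connected = True

    def is_ready(self) -> bool:
        return self._connected


def create_client() -> ReportClient:
    client = ReportClient()
    # load_config() must run before connect(): the client checks its settings
    # during connect() and raises if they are missing. Do not reorder or merge
    # these calls when refactoring.
    client.load_config()
    client.connect()
    return client
```

Note that the comment says nothing about what the two calls do, which the method names already cover. It only records why they are in that order.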

Unit tests

Math exam — Photo by Chris Liverani on Unsplash

Yes, as any test-driven-development advocate will tell you, your unit tests are part of your documentation. That’s why it’s so important to make sure that you’re writing good unit tests that tell you what the code is meant to do, what it isn’t meant to do and what scenarios should lead to what result – it’s much more involved than calling the code in different ways until the code coverage meter hits 95%! You need to make sure that when you’re re-opening the code in 6 months’ time the tests will let you know if the undesired behaviour was an intended feature (at the time) or something that got missed. You also need to have confidence that any changes you make will only affect the scenario that you’re interested in.

This means that each unit test must have a clear, documented purpose. You’ll probably have heard that each unit test must only test one thing – note that this is different from saying it can only have one assert statement! It’s perfectly valid to have multiple assert statements so long as they all focus on the same thing (e.g. proving that all properties on a returned DTO are as expected for the scenario). Test methods should also be short enough that you can clearly understand what is being tested from looking at the code – that may mean making use of well named helper methods or even test helper objects to simplify the test setup, the code invocation and even the checking of the output if required. The classes and methods should be well named, and you should probably use a different naming convention for tests than for the rest of your code. I prefer a more verbose style for tests, with words separated by underscores for easier readability. If this doesn’t give you enough space to describe the purpose of the test, it doesn’t hurt to use inline comments to add further detail.
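Here is a minimal sketch of those points using Python’s built-in `unittest` module. The `register_user` function and `UserDto` class are hypothetical code under test, invented for the example; the test uses a verbose underscore-separated name, a small helper for setup, and several asserts that all check one thing.

```python
import unittest
from dataclasses import dataclass


@dataclass
class UserDto:
    name: str
    email: str
    is_active: bool


def register_user(name: str, email: str) -> UserDto:
    """Hypothetical code under test: trims names, lowercases emails, activates users."""
    return UserDto(name=name.strip(), email=email.lower(), is_active=True)


class RegisterUserTests(unittest.TestCase):
    def _register_sam(self) -> UserDto:
        # Helper keeps the setup out of the test bodies.
        return register_user("  Sam  ", "Sam@Example.com")

    def test_register_user_returns_dto_with_normalised_fields(self):
        dto = self._register_sam()
        # Several asserts, one purpose: the returned DTO matches the scenario.
        self.assertEqual(dto.name, "Sam")
        self.assertEqual(dto.email, "sam@example.com")
        self.assertTrue(dto.is_active)
```

Saved to a file, this can be run with `python -m unittest` followed by the file name. When the test fails, its name alone should tell you which behaviour has been broken.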