Building Code To Last
(even though you should delete it ASAP)
Once your code is born, it’s tumbling towards legacy. Developers are licking their chops at the thought of refactoring your brand-new, beautiful code, making it obsolete, or simply ignoring its existence until it gets cleaned up like so many out-of-scope variables.
Does that mean it should be built with the integrity of a paper plate? No! Follow me and I will show you the way to keep writing good code without feeling the anxiety of a new parent watching their child begin to walk and, gasp, become free to fall.
How to show a codebase you care
You will be with your code for a while, and maybe others will need it too, so it should be a nice environment. It may be difficult to get others to be as nice to your codebase as you are, but it is nearly impossible for a newcomer to come into a messy codebase and make a nice contribution. So, be nice to your code and it will (maybe) be nice to you! There are many ways to show your code that you care:
- tests/example usage
- documentation
- high-quality names
- hide the ugly bits
- bring the important bits to full view
Tests are the best, and most easily quantified, way to care for your code. Regardless of the arguments against writing tests in certain cases, or against using coverage as a quality measure, a repository with no tests will always be harder to contribute to than one with high test coverage. In my opinion 100% coverage is a good goal, but somewhat less is totally acceptable. Some functions are very difficult to test, have very low risk of error, and aren’t worth the trouble. So, don’t fret too much about 100 vs 98. What is highly valuable is to lower the coverage threshold only intentionally. That means setting a test coverage threshold very close to the current value. If someone wants to commit changes that lower the coverage, you will see by how much they are lowering it and can ask whether the untested code passes the value test: is the risk low enough, and the cost of testing high enough, to justify skipping the tests? Without the threshold, low-risk code often makes it through without tests that would have taken just a minute or two to add. The lower the percentage, the more it can creep, until some untested code introduces an error. Tests also make good documentation for confusing code or requirements.
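The threshold idea above can be sketched in a few lines. This is only an illustration under assumptions: check_coverage and lower_threshold are hypothetical helpers with made-up numbers, not the API of any coverage tool, and in a real project you would more likely use whatever minimum-coverage setting your coverage tool provides.

```python
def check_coverage(current: float, threshold: float) -> str:
    """Compare measured coverage with the agreed threshold (both in percent).

    Hypothetical helper for illustration only.
    """
    if current < threshold:
        drop = threshold - current
        # Failing here forces the conversation: is this untested code low-risk
        # and expensive to test, or did someone skip a two-minute test?
        raise ValueError(
            f"coverage fell {drop:.1f} points below the {threshold:.1f}% threshold"
        )
    return f"coverage {current:.1f}% meets the {threshold:.1f}% threshold"


def lower_threshold(threshold: float, new_value: float, reason: str) -> float:
    """Lowering the bar is allowed, but only deliberately and with a reason."""
    if not reason.strip():
        raise ValueError("a lowered threshold needs a written justification")
    if new_value > threshold:
        raise ValueError("use this only to lower the threshold")
    return new_value
```

The point of the second function is that the threshold never drifts by accident: someone has to write down why the lower number is acceptable.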
For many repositories, tests are sufficient documentation of how to use the functions and objects defined within. Repositories that are intended to be reused by many can benefit greatly from example code, either documentation-style or as a repository of small projects demonstrating different abilities and use cases. This usually gets scrapped in a project because the value is seen to be for the user coming along to use it. That is true: the user will get a lot of value out of it. What is often overlooked is that it provides immense value for the authors of the code. When you force yourself to use your own library (or framework, etc.), you will inevitably find inefficiencies and opportunities for enhancement. Some of these will likely be very easy to correct on the fly and may save the pain of breaking changes later in the life of the code. Also, a library generally dies simply because nobody is using it, and if it’s hard to get started with, or not clear when it should be used, it won’t be used. In fact, just as test-driven development can be good for keeping your eye on the prize, it can be useful to start with a sample project, using the library as you wish it would work, even before it exists.
Documentation has been under attack since before I became a software engineer (“Programmer Analyst” when I started). I’ve often seen it dismissed with a nod to the Agile Manifesto, which doesn’t even dismiss documentation; it just sets the priority on working software. But documentation is valuable for the same reasons as the example usage mentioned above. While the example usage gets most of the difficult work done, the documentation makes sure that your code is explainable, which is related to, but not the same as, usable. The more explainable your code is, the easier it is to get people using it and contributing to it. I have witnessed intentionally confusing code, with the goal of inhibiting contribution, but I’ll leave that to the side as an anti-pattern. Where documentation differs most from example code is that it can convey what is important in the examples. For instance, maybe all of the example code has the same folder structure; documentation can explain that the folder structure is actually required or, if not required, why it might be recommended. It can also explain how to contribute, build it locally, etc. The act of documenting these processes is very useful when you decide to streamline or automate, and it gives you something to refer to when responding to pull requests that stray from the norm.
If the names don’t represent the thing they are naming, they are actively distracting the reader from what they need to know. I once reviewed some code that had a function named sendEmail. It’s pretty obvious what you would think it does, but what it actually did was insert a new customer into a database. It was called sendEmail because there was another process that would look for new customers in the table and send them an email. Now, what happens when you want to change how the email is sent? You may quickly find the sendEmail function, and that will be no help. What’s more, what if you want to change something about how new customers are added to the database?
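Here is a rough reconstruction of that situation. The original code isn’t available, so everything other than the name sendEmail is invented for illustration: the in-memory table, the customer fields, and the better-named functions in the second half are all hypothetical. Python naming conventions are used, so sendEmail appears as send_email.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerTable:
    """Stand-in for the database table (a hypothetical in-memory version)."""
    rows: list[dict] = field(default_factory=list)


# Misleading: the name promises an email, but the body only inserts a row.
def send_email(table: CustomerTable, name: str, address: str) -> None:
    table.rows.append({"name": name, "email": address, "welcomed": False})


# Clearer: each name says what the function actually does.
def add_customer(table: CustomerTable, name: str, address: str) -> None:
    table.rows.append({"name": name, "email": address, "welcomed": False})


def send_welcome_emails(table: CustomerTable) -> list[str]:
    """The separate process: find new customers and email them.

    Returns the addresses it would have mailed; no real email is sent here.
    """
    mailed = []
    for row in table.rows:
        if not row["welcomed"]:
            row["welcomed"] = True
            mailed.append(row["email"])
    return mailed
```

With the second pair of names, someone changing how emails are sent lands in send_welcome_emails, and someone changing how customers are stored lands in add_customer.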
We’ve all heard the advice about small functions and keeping the public API clean by making internals private. This is great advice, and the public API should receive the utmost care. The small functions directive can, at times, drive bad behavior. It is not analogous to the 100% test coverage goal: we should not strive for one-line functions. We should strive for easily named functions with limited use of “and.” If you are tempted to put “and” in your function name, you either need a more abstract name, or two separate functions… most of the time. The goal here is to bring the important bits to the front and hide the dirty details. While the most important parts are the public API, you will likely have important internals as well, and they deserve to be highlighted in the service of clarity. Think about not only naming here, but also organization. Depending on the language, you will find better and worse ways of doing it. By doing this, you will often find more opportunities for reuse as well. As you find these reuse opportunities, don’t be precious with the function name. You may realize the name was too specific or too general when you started; change it when you think of a better one!
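As a small sketch of the “and” rule, here is one function with two jobs, followed by both ways out. The order-handling domain and every name in it are hypothetical, chosen only to make the shape visible.

```python
# Before: the "and" in the name signals two jobs in one function.
def validate_and_save_order(order: dict, store: list) -> None:
    if order.get("quantity", 0) <= 0:
        raise ValueError("quantity must be positive")
    store.append(order)


# Option 1: two separate functions, each easy to name and to reuse.
def validate_order(order: dict) -> None:
    if order.get("quantity", 0) <= 0:
        raise ValueError("quantity must be positive")


def save_order(order: dict, store: list) -> None:
    store.append(order)


# Option 2: a more abstract name for the combined step, built from the pieces.
def place_order(order: dict, store: list) -> None:
    validate_order(order)
    save_order(order, store)
```

In this sketch place_order is the important bit brought to the front, while validate_order and save_order stay available for reuse elsewhere.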
Why did the engineer refactor the code?
because it was there
- it’s ok
- it was good enough to use
- change is required
So you’ve given your all and put your blood, sweat, and tears into this beautiful library. Then you get an email, “Pull request on bakerag1/perfect-lib opened.” How can this be? A well-meaning engineer has made changes to the code, and while you can recognize that you had missed a small nuance in the error handling, why has the method name changed?? Panic ensues, the heart rate increases, the mouth is dry, and water doesn’t help! It’s ok. Take a breath, take five breaths. A pull request isn’t only an indicator that something needed changing, it’s also a sign that the codebase is worthy of change: its purpose is a useful one, and it is written well enough to be worth changing instead of starting over. So, look again with fresh eyes, and if you have questions or concerns, comment with them, politely. Now you have a collaborator, and that is helpful!
When you pushed your last commit and ceremoniously dusted your hands, it wasn’t as final as you imagined. It was already in need of amendment. Imperfect is the only valid state of a piece of code; change is required. If you aren’t finding issues in your code, you aren’t looking in the right way. Now, code can certainly reach a good enough state, and it might be the right choice to stop your search for problems, but if it gets a lot of use, you can expect to hear about troubles. The best possible way to hear about a problem is in the form of a pull request. It should clearly identify the problem and the solution, and it indirectly lets you know someone is helping you care for the code.
Delete wherever possible
- worthless code muddies the water
- the last thing you can do for your codebase is to delete it
We’ve already covered that even the best code is never perfect, but surely it should at least be good enough to be saved from deletion!? Deletion isn’t the destiny of bad code, it’s the destiny of nearly all code, or at least it should be. Code that doesn’t provide value is worse than worthless, it’s a problem: worthless code muddies the water. Unnecessary code can add to build and execution time, but even worse is that it prevents developers from finding what’s important. One of my fondest memories of working with Java and Spring is when I found that we could replace some boilerplate code in hundreds of files with a little bit of XML in the context. I committed the deletion of thousands of lines of code, and bragged to everyone who would listen. (I guess I am still doing it!) So, if you see some code that feels like a waste, put in the work to delete it!
The looming next logical step in deleting unnecessary code is to delete unnecessary repositories. This gets a little scary: are you sure nobody is using it? Are you sure you won’t need it later? When I was a junior developer I used to hear the acronym YAGNI (You Aren’t Gonna Need It) a lot, which came from the great Extreme Programming (XP) movement. The point of the YAGNI principle was to avoid overbuilding a solution. For example, if it’s acceptable to create a connection on each request, don’t build a connection pool. The extension of the YAGNI principle is that if something was already built but isn’t needed, you should delete it. The connection pool may be a poor example for this, as it may well become necessary later, so you may not want to delete it. But let’s say you built the connection pool library for an internal, house-made system that has since been replaced. Do you need to hold on to the library? You Aren’t Gonna Need It. It was a piece of work you can be proud of; do you really want to watch it moulder on the shelf? The last thing you get to do for your codebase is delete it. Just delete it. No one will notice, but they will appreciate that their code searches don’t find your 10-year-old code that no longer does anything.
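The “connection on each request” side of that YAGNI example can be as small as the sketch below. It rests on assumptions: Python’s built-in sqlite3 module stands in for whatever database the real system would use, and handle_request and database_path are hypothetical names.

```python
import sqlite3


def handle_request(database_path: str, query: str) -> list[tuple]:
    """Open a connection, run one query, close the connection.

    No pool and no reuse: the simplest thing that is acceptable today.
    """
    connection = sqlite3.connect(database_path)
    try:
        return connection.execute(query).fetchall()
    finally:
        # Always release the connection, even if the query fails.
        connection.close()
```

If measurements later show that opening connections is the bottleneck, a pool can replace the body of this one function; until then there is nothing extra to maintain, or to delete.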
Why did I bother in the first place?
- it’s good practice
- it could have made it
- it provided the tools needed to become obsolete
You may be thinking, “Great, I spent all this time and care making my precious library, then deleted it. Why did I bother?” Well, first of all, you likely learned some things in the process. It’s uncommon to put effort into building something without having to investigate aspects of the language, the problem domain, etc. It’s no surprise that the best way to learn something is to build something with it; that’s why there are tutorials for nearly every successful framework, library, language, etc. So, consider your unneeded project a super-high-quality tutorial. You don’t keep your tutorial code, do you? It was good practice. Delete it and move on.
Also, you can’t predict the future. When you wrote the code, you thought it was going to be useful. It’s no big deal that you were wrong; people are wrong all the time. Roughly 20% of businesses fail in their first year, and those failures are likely much more impactful than your code failing to gain acceptance or prove useful. It can be hard to let go of your hard work, and it might be worth taking a second look at it. Maybe it just needs a clearer name, better public relations, or a simpler interface. It can be difficult to determine if a project is dead, or if it just needs a boost. But, if you’ve determined it’s dead, get rid of it!
The best-case scenario is that this codebase was a launching pad, providing the necessary ingredients to its successor. The philosopher Wittgenstein compared his own propositions to a ladder: you climb them to reach new heights, but once you’ve reached the top, you must throw away the ladder and move on.[1] Usually this sort of growth would simply mean a major version release, but there are cases where the preferred approach is to really start over with a new project. Then will you need the old? You Aren’t Gonna Need It!
Saying Goodbye
In closing, I hope this has helped you to write code like it will be there forever, and delete code like it was never meant to exist. Juliet said to her star-crossed lover, “Parting is such sweet sorrow,” but whether you love your code or not, deleting it can just be sweet.
[1] Ludwig Wittgenstein, Tractatus Logico-Philosophicus, 6.54