Refactoring larger legacy codebases

Feb 28, 2019

This article is based on a comment of mine on a Hacker News question. I thought it might be useful to explain it here in a bit more detail.

Questions I am trying to answer here:

  • How can you be productive in larger codebases?
  • How can you improve legacy codebases?

The #1 rule of managing legacy codebases is “code that doesn’t get touched dies” - so you want to “touch up” important code as often as possible and get into a habit of small improvements.

Here are a few thoughts about how to approach this…

1) Get the team on board

If multiple people are working on this codebase, you need their buy-in and support for whatever approaches you together choose to implement.

2) Plan for “health by a thousand small improvements.”

Legacy code was created by “death by a thousand cuts”.

You won’t be able to fix everything in one big step. You won’t be able to stop working and “refactor everything”. And more importantly, you shouldn’t do it. It never works out.

This refactoring will be an iterative approach over a longer time. Your goal is to refactor parts as you go.

3) Don’t assume different = bad

People who worked on this codebase in the past might have done things differently from how you would do them. Invest in understanding their approaches and consider using them. Consistency beats beauty in codebases. Codebases get bad when multiple people each try a different approach. Don’t be one of them, whenever you can avoid it.

4) Create space

Consider introducing a Fix-it Friday.

The rule of Fix-it Friday is simple: unless your current project is on fire, use Fridays to invest in little improvements. Let engineers choose on their own what they work on. Try not to take the “fun” out of this by micromanaging. Some will try out new libraries. Some will remove bugs from the backlog. Both are fine. Try encouraging a balance of tasks.

Encourage these improvements by summarizing them (e.g. weekly in Slack) and maybe use them for quarterly peer reviews.

5) Create a no-blame culture

Stuff will break if people risk improving things. Avoid shifting blame to them.

Those things can be subtle. Example: bug trackers might ping people individually. Consider pinging the whole team instead.

6) Automate whatever you can automate

Introduce linters, auto-formatting, codemods, CI/CD tools, danger.js, code complexity analysis, etc. Use tools like lint-staged to encourage step-by-step improvements on new code.
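As a rough illustration of the “step by step on new code” idea, here is a minimal Python sketch that lints only the files staged for the next commit, similar in spirit to what lint-staged does. The choice of flake8 as the linter and the helper names (`staged_files`, `select_lintable`) are assumptions for illustration, not something the tools above prescribe.

```python
import subprocess
import sys


def staged_files():
    """Return the paths of files staged for the next commit."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def select_lintable(paths, suffixes=(".py",)):
    """Keep only the paths the linter understands (hypothetical filter)."""
    return [p for p in paths if p.endswith(tuple(suffixes))]


def main():
    targets = select_lintable(staged_files())
    if not targets:
        return 0  # nothing staged that we can lint
    # Assumption: flake8 is installed; swap in whatever linter the team uses.
    return subprocess.run([sys.executable, "-m", "flake8", *targets]).returncode

# In a pre-commit hook you would end the script with: sys.exit(main())
```

Because only touched files are checked, old code does not block anyone, but every file someone edits gets a little cleaner.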

7) Introduce tests

This is one of the most annoying parts, but worth doing: whenever you improve a feature, try adding a test. Do this step by step. Consider the tests a form of documentation explaining what the functionality should do.

A lot of people recommend writing a test suite for the whole app before you do anything. If you are lucky enough to be able to do this… well… try it. I always found the iterative approach more realistic, as you can also do feature work while refactoring. I was also rarely able to add tests unless I was actively working on a feature. But you are not me, so try it; maybe it works for you.

When writing tests, focus on integration tests (vertical/functional/etc) and not unit tests (unless the “unit” contains critical or complex logic).

Your goal is to know “that you broke something”, it’s ok not to know exactly “what you broke”.
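A minimal sketch of what such a coarse test might look like in Python: it runs a whole feature end to end and compares the result to a previously recorded value. `checkout_total` is a hypothetical stand-in for a legacy entry point, and the expected value is simply what the code returned before the refactoring started.

```python
import unittest


def checkout_total(items, coupon=None):
    """Hypothetical legacy entry point: pricing, discount and tax in one place."""
    subtotal = sum(price * qty for price, qty in items)
    if coupon == "TEN":
        subtotal *= 0.9
    return round(subtotal * 1.2, 2)  # assumed flat 20% tax


class CheckoutSmokeTest(unittest.TestCase):
    def test_typical_order_total_is_unchanged(self):
        # Recorded from the current behavior, not derived from a spec.
        items = [(10.0, 2), (5.0, 1)]
        self.assertEqual(checkout_total(items, coupon="TEN"), 27.0)
```

If this test fails after a change, it does not say whether pricing, the coupon, or the tax logic is at fault. It only says that something observable changed, which is often enough to stop and look.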

8) Acknowledge tech debt

Not everything needs refactoring.

If it’s not critical, or nobody needs to improve its functionality in the next few months, or it’s just too complicated, consider acknowledging it as tech debt.

Add larger notes above the problematic areas and explain why you aren’t refactoring them, explain things that might help the next person understand the code better, etc.

Whenever you leave comments, remember that comments should explain “why” not “what” the code does.
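For illustration, such a note might look like the following Python sketch. The function and the reasons given are made up; the point is the shape of the comment, which explains why the code is left alone rather than what it does.

```python
# TECH DEBT (acknowledged; hypothetical example):
# Why we are not refactoring this: the export format is consumed by an
# external partner and nobody plans to change it in the coming months.
# What helps when reading it: fields are positional, and missing values
# must stay as empty strings because the consumer counts separators.
def legacy_export_row(record):
    fields = [record.get("id", ""), record.get("name", ""), record.get("vat", "")]
    return ";".join(str(f) for f in fields)
```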

You should notice improvements quite quickly (a few months in).

Good luck!

If I can help somehow, feel free to send me a DM via Twitter!