Iterative Development and Abstractions

Matthew Tapps, Software Engineer

If you've worked in a startup or built a product yourself, you will likely be familiar with this situation: A startup's focus is on building a working product that can be sold, and once it can be sold, the focus is on developing that product more so that it can be sold more.

That's definitely the right approach for a startup to take - if we're so focused on engineering things perfectly in our first iteration that we miss an opportunity, then it doesn't matter how perfect our solution is, we're all out of a job.

There will come a point, though, where engineers start to feel friction during continued development of the product. This is especially felt when new engineers join the team and need to get up to speed with existing code that has been incrementally iterated upon. As an aside, this can be especially difficult depending on the current workload of the engineers who built the product, who are likely to have been elevated into positions with far more responsibilities than working on one specific section of a product (or worse, those engineers may have since left the company, taking their domain knowledge with them).

One cause for this friction is that when we are building a first product or proof of concept in a domain (especially if we aren't familiar with that domain, which we might not be as engineers brought in to make a product), we often don't know what the optimal design for the product's code base is - we just make decisions that work for now and get it working.

The product then expands and new functionality is built on top of the previous design decisions. Further design decisions are made to account for new problems, and while they work, they are potentially sub-optimal - this slowly builds up until adding new functionality to the product becomes an increasing chore.

My recent thinking is that the solution to this issue is to find the right abstraction. The aforementioned sub-optimal design decisions that are made early on, or during iteration of the product, are likely to be very literal, with a straightforward implementation of the solution - there is probably no abstraction at all at this point.

(There is obviously an alternative here - the wrong abstraction may have been chosen in an early design decision. My definition of "wrong abstraction" here could be something that you choose because it's a pattern you've seen elsewhere, or because it will make it easier to address a nice feature that you'd like to build into the product eventually, or because it lets you optimise something that doesn't need to be optimised yet. I would argue that this is worse than not choosing an abstraction at all, as in most cases it will increase the work required to refactor into the right abstraction.)

The right abstraction is something that will become clear once the product has been built and iterated on for some time, as the engineers build up their understanding of the domain and how it can be emulated within code. It might be necessary to ensure engineers have time set aside to consider these things as part of their normal development life cycle - if engineers are at 110% work load building new features, they won't have the bandwidth to think about what abstractions are beginning to naturally arise. One way to ensure the engineering team are budgeting enough of their time towards this as part of the development lifecycle is to use error budgets, where you set an SLO that determines the acceptable volume of errors for a given service - when the number of errors exceeds the budget, it’s a key indicator that the engineering team should prioritise platform maintenance tasks.
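To make the error budget idea concrete, here is a minimal sketch of the arithmetic. The function names, the 99.9% target, and the request counts are all hypothetical and only for illustration; a real service would pull these numbers from its own monitoring.

```typescript
// Hypothetical helper: how many more failed requests can we tolerate
// in the current window before the SLO is breached?
function remainingErrorBudget(
  sloTarget: number, // e.g. 0.999 means 99.9% of requests should succeed
  totalRequests: number,
  failedRequests: number
): number {
  // The budget is the share of requests the SLO allows to fail.
  const allowedFailures = Math.floor(totalRequests * (1 - sloTarget));
  return allowedFailures - failedRequests;
}

// Hypothetical policy: once the budget is spent, maintenance work
// takes priority over new features.
function shouldPrioritiseMaintenance(
  sloTarget: number,
  totalRequests: number,
  failedRequests: number
): boolean {
  return remainingErrorBudget(sloTarget, totalRequests, failedRequests) < 0;
}
```

With a 99.9% target over one million requests, the budget in this sketch is 1,000 failed requests, so 1,200 failures would tip the team towards maintenance work.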

Once that right abstraction has become clear, the next step is to set aside engineering time to refactor the existing code and implement it.

Don't Choose The Wrong Abstraction: Starting A Project Right

I believe that when starting a project, over-engineering to try to do things perfectly from the start often results in choosing the wrong abstraction, and that choice is compounded as new features are added. As mentioned above, I believe it is instead better to create as few abstractions as possible while first creating a new product in an unfamiliar domain; the right abstractions will naturally reveal themselves over time.

So what can you do at the start of a project to set yourself up for success?

I start my new projects by describing the domain in data structures. For example, if I am developing an application that deals with cars, I will probably need a Vehicle data structure to easily deal with vehicles in my code. A Vehicle would then have attributes that I may be able to further define, like a Transmission Type that has one of two potential values, Manual and Automatic - so Transmission Type could be defined as needing to meet one of those two values. Another potential trait is Fuel Type, which could be one of Diesel, Petrol, or Electric. Here’s an example of how these data structures might be defined in TypeScript:

type TransmissionType = 'Manual' | 'Automatic';

type FuelType = 'Diesel' | 'Petrol' | 'Electric';

type Vehicle = {
  transmissionType: TransmissionType;
  fuelType: FuelType;
};

After these basic data structures have been defined, it is easier to start writing code that uses domain-specific language, which increases the clarity of what your code is doing for all future readers: if we know a variable is of the type FuelType, we know that its value is ‘Diesel’, ‘Petrol’, or ‘Electric’, and now all of our code dealing with FuelTypes can be written based on these three possible values. Depending on the language you are writing in and your IDE, these domain language-based rules can be enforced while you are coding - using the TypeScript example above:

const fakeFuelType = 'LPG';
const realFuelType: FuelType = 'Diesel';

// Error flagged by the IDE at compile time: realFuelType can never be 'LPG'
// according to the type definition of FuelType
if (fakeFuelType === realFuelType) {
  // ... do stuff
}

...

// This would also error straight away, and is an example of why using type
// signatures everywhere can be a good practice, at least in TypeScript
const fakeFuelType: FuelType = 'LPG';

Even if your language of choice doesn’t support compile-time type checking, defining your data structures (even just for yourself) will help inform the way you write your code and make it clear to readers what the expected values of variables are (and far more people will be reading your code than writing your code).

If we need a function that performs an operation based on the Fuel Type of a Vehicle, the data structure we defined informs the design of that function - we need to handle exactly three possible values, so our function will likely have three conditional cases. If we hadn’t defined the data structure, we might only begin to consider what possible values a Vehicle’s Fuel Type could be once we start writing that function, instead of having considered and refined what those values could be prior to needing that function.
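As a sketch of what that might look like, here is a function with one case per Fuel Type. The `refuellingMethod` operation and its return strings are hypothetical, made up purely for illustration. The `never` assignment in the default branch is a common TypeScript pattern for exhaustiveness checking, not something the data structure requires.

```typescript
type FuelType = 'Diesel' | 'Petrol' | 'Electric';

// Hypothetical operation: describe how a vehicle is refuelled.
function refuellingMethod(fuelType: FuelType): string {
  switch (fuelType) {
    case 'Diesel':
      return 'Fill up at a diesel pump';
    case 'Petrol':
      return 'Fill up at a petrol pump';
    case 'Electric':
      return 'Plug in at a charging station';
    default: {
      // If a new value is added to FuelType and not handled above,
      // this assignment stops compiling, which points us at every
      // function that needs a new case.
      const unhandled: never = fuelType;
      throw new Error(`Unhandled fuel type: ${String(unhandled)}`);
    }
  }
}
```

If 'LPG' were later added to FuelType, the compiler would flag this function until a fourth case was written, so the data structure keeps informing the code as the domain grows.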

What does a wrong abstraction look like?

Let's say my Vehicle program matches plain-text descriptions of vehicles to a specific Vehicle in my database. Vehicles have a number of traits that uniquely identify them, like their Transmission Type, and perhaps a Fuel Type. An example of a potential incorrect abstraction would be to write a generic function that can extract different traits from that plain-text description, using the same function, something like this:

const fuelType = extractVehicleTrait(description, 'fuelType');
const transmissionType = extractVehicleTrait(description, 'transmissionType');

At first glance, this doesn't seem like a poor abstraction: we're avoiding needing to repeat similar code to write two functions that extract the Fuel Type and Transmission Type separately. The downside is that this abstraction will now become foundational to our code - if we want to update the Transmission Type extraction functionality, we need to make sure that it doesn't affect our Fuel Type extraction functionality at the same time, which may be difficult when they are done within the same function.

Importantly, as the product is developed further, more code will start to rely on this function, making it increasingly difficult to refactor away from that design decision down the line if it is a wrong abstraction.
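To illustrate the coupling, here is one way such a generic function might be written. This is only a sketch: the keyword table, the `TraitName` type, and the matching rule are all assumptions, since the real implementation could look quite different.

```typescript
type TraitName = 'fuelType' | 'transmissionType';

// Hypothetical keyword table shared by every trait.
const TRAIT_KEYWORDS: Record<TraitName, string[]> = {
  fuelType: ['Diesel', 'Petrol', 'Electric'],
  transmissionType: ['Manual', 'Automatic'],
};

function extractVehicleTrait(
  description: string,
  trait: TraitName
): string | undefined {
  const lowered = description.toLowerCase();
  // Every trait goes through this one matching rule, so changing how
  // transmissions are matched also changes how fuel types are matched.
  return TRAIT_KEYWORDS[trait].find((keyword) =>
    lowered.includes(keyword.toLowerCase())
  );
}
```

Notice that in this sketch the return type has widened to a plain string, so callers also lose the FuelType and TransmissionType guarantees that the data structures were giving them.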

An alternative would be to avoid the abstraction in the initial stages of product design, writing both an `extractFuelType` and `extractTransmissionType` function. Then, once the product has been developed further, an abstraction may become logical - maybe after a few months of development we have noticed that we always extract the Fuel Type and Transmission Type traits in the same way, and because they have their own functions to handle trait extraction we end up doubling up on a lot of code in other places. Once the use case has shown itself, it becomes clear that combining those functions into one `extractTraits` function is the right abstraction.
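Here is a rough sketch of what the two separate functions might look like early on. The matching rules are assumptions for illustration only (for example, treating "auto" as shorthand for Automatic); the point is that each function is free to change without touching the other.

```typescript
type FuelType = 'Diesel' | 'Petrol' | 'Electric';
type TransmissionType = 'Manual' | 'Automatic';

function extractFuelType(description: string): FuelType | undefined {
  const lowered = description.toLowerCase();
  if (lowered.includes('diesel')) return 'Diesel';
  if (lowered.includes('petrol')) return 'Petrol';
  if (lowered.includes('electric')) return 'Electric';
  return undefined;
}

function extractTransmissionType(
  description: string
): TransmissionType | undefined {
  const lowered = description.toLowerCase();
  // Transmission-specific rule (hypothetical): accept 'auto' as shorthand.
  // Adding it here cannot affect fuel type extraction.
  if (/\bauto(matic)?\b/.test(lowered)) return 'Automatic';
  if (/\bmanual\b/.test(lowered)) return 'Manual';
  return undefined;
}
```

There is some repetition between the two, but each one returns its own domain type, and if they do end up evolving in lockstep, that shared shape is the evidence for combining them later.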

Summary

Don't design an abstraction too early - there's a good chance it will be the wrong abstraction, and it'll be harder to update later than if you had chosen no abstraction at all
A bit of code repetition is okay when you start a project, especially when you're new to a domain
The right abstraction will likely show itself to you once you have worked within the domain and iterated on your product for long enough
Ensuring engineers have time to consider what abstractions might help their code and assess any areas of friction in a code base is a good practice for keeping a healthy code base
Time spent addressing tech debt indirectly increases sales by maintaining development velocity and improving product reliability, but knowing that doesn't make it easy to prioritise over feature work (error budgets may be useful here)