Exploitation vs Exploration in Software Development

Today I'm going to explore a question that's been rattling around in my head: "Why does software inevitably get worse over time?" Every software system I've worked on has decayed and become difficult to work with, despite everyone's best intentions. Why is this? Well, I've come to believe that this is a systemic effect, resulting from businesses' hyper-focus on immediate profit, which causes an imbalance between exploitation and exploration.

Exploitation vs Exploration

The exploitation vs exploration dilemma is a key factor in any decision that involves limited information and our appetite for risk (“exploit” in this context carries the traditional meaning of the word: to take full advantage of a resource). Do we mine the known resource we have right in front of us? Or do we search for other resources that may or may not exist? One has an immediate benefit: we get the resource. The other has a potential benefit: we might find a better resource, but at the expense of having no resource while we explore. At its core it's a cost/benefit analysis. Is it worth exploring or is the risk too great? Should we play it safe or are we missing out on more? Could someone else take this resource while we're searching for something better? Or could someone else discover a better resource while we're exploiting this one?
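For the programmers reading, here's a rough sketch of how this dilemma is often modelled: the "multi-armed bandit" with an epsilon-greedy strategy. This is a standard textbook technique that I'm borrowing for illustration, and the payout probabilities, function name and parameters are all made up. Each "arm" is a resource with an unknown payout; with probability epsilon we explore a random arm, otherwise we exploit the arm with the best average so far.

```python
import random

def epsilon_greedy(true_payouts, epsilon, rounds, seed=0):
    """Play a toy multi-armed bandit; return (total_reward, pulls_per_arm)."""
    rng = random.Random(seed)               # seeded so runs are repeatable
    arms = len(true_payouts)
    pulls = [0] * arms
    estimates = [0.0] * arms                # what we currently believe each resource pays
    total_reward = 0.0
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(arms)       # explore: try any resource at random
        else:
            # exploit: pick the best known resource (ties go to the lowest index)
            arm = max(range(arms), key=lambda i: estimates[i])
        reward = 1.0 if rng.random() < true_payouts[arm] else 0.0
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]  # running average
        total_reward += reward
    return total_reward, pulls

# Hypothetical payout probabilities: the last resource is the best one.
payouts = [0.2, 0.5, 0.8]
for eps in (0.0, 0.1, 1.0):
    reward, pulls = epsilon_greedy(payouts, eps, rounds=1000)
    print(f"epsilon={eps}: reward={reward:.0f}, pulls={pulls}")
```

With epsilon set to 0 this sketch never leaves the first resource it tries, because it never gathers any information about the others. With epsilon set to 1 it wanders at random and never cashes in on what it has learned.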

[Figure: Spectrum]

To make things more complicated, resources have a shelf life. As time goes on the exploitable resource will diminish (nothing lasts forever, after all), and at some point the risk of continuing to exploit it becomes higher than the risk of exploring for a new one. Figuring out when you've reached that tipping point is the trick.

This dilemma appears everywhere, as every decision ultimately lands somewhere on the spectrum between the two:

  • Entertainment: Watch an old movie or a new one?
  • Dinner: Cook your favourite meal or follow a new recipe that might be better?
  • Clothes: Stick with what's in the wardrobe or go shopping for something new?
  • Dating: Stay with my current partner or look for someone that's a better match?
  • Investing: Invest in a known stock or put it all into a meme coin?
  • AI: Choose the known safe answer or explore other potential answers?

These are extreme examples, by the way. In reality it is a spectrum, and there are many options with differing balances of exploit/explore. E.g. you could watch a new movie that's part of a franchise you enjoy: a little bit of story exploration with comfort character exploitation.

Or you could yolo it and explore something new by watching Paprika, because ultimately exploration is fun!

Exploitation is Productive, Exploration is Fun!

A key thing that I rarely see discussed in this topic is the emotional pull of both sides of the dilemma. Exploiting a known resource is productive and relatively stress free: you know what you're getting, so it's safe. It is also boring. Exploration, on the other hand, is fun, a little exciting, and fulfilling. Exploration is how we learn and grow. Even when it doesn't have an immediate payoff, the knowledge can still serve us moving forward.

This is actually why so many video games feature open world exploration. They weigh up the balance between exploitation and exploration, presenting safe exploitable fun through the sign-posted main quest, and then exploration based fun focussing on adventure and the discovery of new, interesting locations and usable resources. Breath of the Wild is great at this balance BTW.

Life isn't a game, and when money is on the line we need to get results, so "fun" isn't a good enough reason on its own to explore. Which leads to the question, when should we explore?

Local Maxima, Global Minima

Why explore when we have a perfectly good resource to exploit right here? Well, the gap between exploit and explore is knowledge. We know what we have in front of us, but we don't know what other resources are out there. While what we have might look good, we might be missing out on even bigger gains and exploitation potential elsewhere. In fact, what we have right now might be the smallest vein of gold in the valley. This is known as the local maximum vs global maximum problem.

[Figure: Local Maxima, Global Maxima]

A local maximum is the highest point among nearby points. The global maximum is the highest point across the entire domain. The less knowledge we have about a domain, the more likely it is that we're at a local maximum and not the global one, meaning we're leaving a potentially better, more exploitable resource out in the open for someone else to take. This is the balance between exploitation and exploration. We want to get the maximum return for our energy investment without missing out on better resources in our immediate vicinity. This is a tricky balance to reach in life and in business.
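To make the local vs global idea concrete, here's a minimal sketch using hill climbing, a standard search technique. The landscape values and function names are made up for illustration. A plain hill climber only ever looks at its neighbours (pure exploitation of local knowledge), so it stops at whichever peak is closest. Adding random restarts is a crude form of exploration.

```python
import random

def hill_climb(height, start, lo, hi):
    """Step to the better neighbour until neither neighbour is higher.

    Assumes lo < hi, so every point has at least one neighbour.
    """
    x = start
    while True:
        neighbours = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        best = max(neighbours, key=height)
        if height(best) <= height(x):
            return x                        # no neighbour improves: we are on a peak
        x = best

def hill_climb_with_restarts(height, lo, hi, restarts, seed=0):
    """Explore by climbing from several random starting points, keep the best peak."""
    rng = random.Random(seed)
    peaks = [hill_climb(height, rng.randint(lo, hi), lo, hi) for _ in range(restarts)]
    return max(peaks, key=height)

# Hypothetical landscape: a small peak at index 2 and a taller one at index 8.
landscape = [0, 2, 3, 1, 0, 4, 6, 8, 9, 5, 2]

def height(x):
    return landscape[x]

print(hill_climb(height, 1, 0, 10))                   # climbs to the nearby small peak
print(hill_climb_with_restarts(height, 0, 10, 20))    # restarts let it find the taller peak
```

Starting at index 1, the climber settles on the small peak because nothing in its immediate vicinity looks better. It has no way of knowing the taller peak exists without paying the cost of starting again somewhere else.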

Business and Exploitation

In business a dollar today is better than a dollar tomorrow. A dollar today can be used immediately whereas a dollar tomorrow cannot. A promised dollar carries more risk, as it may never actually appear or it may lose value (inflation). This is why so many businesses actively get loans and go into debt. If the interest is low then it just makes sense to have the money now, as it gives you more options.
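The usual way to put a number on this is the present value formula, which discounts a future payment by a rate that stands in for risk, inflation and missed opportunities. Here's a minimal sketch; the amounts and the 8% rate are assumptions picked purely for illustration.

```python
def present_value(future_amount, rate_per_period, periods):
    """Discount a future payment back to what it is worth today."""
    return future_amount / (1.0 + rate_per_period) ** periods

# Hypothetical numbers: $100 promised in 1, 5 and 10 years at an assumed 8% annual rate.
for years in (1, 5, 10):
    print(years, round(present_value(100.0, 0.08, years), 2))
```

The further away the payment and the higher the rate, the less that promised dollar is worth today.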

From an ideological perspective this means that businesses, particularly mature ones, will lean towards exploitation over exploration.

How much they lean towards exploitation depends on their maturity. All businesses start off exploring more than they exploit, particularly if they're trying something new. There's risk in opening a new shop, and even more risk in a startup. Startups are exploring a new business opportunity and that exploration is costly. Their hope is that they can find and meet an unmet business need, i.e. discover and then exploit a new market/resource, a new maximum. Once the business matures and has captured its market it will naturally become more focussed on exploitation over exploration, playing it safe.

This is even truer for businesses that go public on the stock market. Shareholders want immediate returns, especially since many shareholders are actually speculators, so they put pressure on their executive teams to make more money from (i.e. exploit) the existing business and its resources. At this stage exploitation is the main driving ethos and everything is about making as much money from the resources (customers) as possible. The appetite for exploration will wane as only the safest short term bets will be prioritised.

What about the joy of exploration? Well, while exploration is fun, we first need safety (Maslow's hierarchy of needs), and a stable job is core to feeling safe and secure. In an exploit focussed company only safe bets will be prioritised and the fear of failure will drive most of the fun out of exploration, removing its innate pull.

This is why internal startups and “intrapreneurs” rarely succeed: they are trying to explore in an environment that wants them to exploit safe and known resources. It doesn't matter how big the market for the new resource is, the desire for short term gains will ultimately drive the new venture into the ground and the team members will be reallocated to something more immediately profitable.

A key thing to mention here is that no resource can be exploited indefinitely, and focussing entirely on exploitation can lead, and has led, to a company's downfall.

When Exploitation Backfires

The overemphasis on exploitation over exploration has been the root cause of great losses for many companies. Kodak famously invented the digital camera but failed to capitalise on it, as they thought it would eat into their currently exploitable film market. Digital cameras happened anyway, and instead of owning that market they watched as their safe exploitable resource slowly dwindled and disappeared. By playing it safe they ultimately missed out on owning the new market and their business failed.

The same thing happened to Xerox. They invented the Graphical User Interface (GUI), but it didn't align with their safe exploitable business of expensive copiers, so they sat on it. Steve Jobs and Bill Gates "borrowed" the idea and ended up revolutionising home computing and operating systems, building billion dollar companies with market caps that Xerox can only dream of. This is what happens when you over-rely on exploitation.

You know what is another resource that is easy to exploit once you have it? Code.

Software Quality and Exploitation

I am a software developer (a shocking twist in this article, I know) and I've been working in web development for 20 years now (apparently, I don't look it). In that time I have come to understand the value of quality software. "Quality" in software is often considered a nebulous concept, one that we can't nail down. To me quality is all about design: does the design allow you to quickly and safely change the software to meet the growing and changing needs of the business? If so, then it is high quality! Quality is domain specific: the more a domain changes, the higher the need for quality. By this definition quality software is easy to understand, well documented and low maintenance. Building a quality system requires investment, i.e. exploration.

You know how many high quality systems I've worked in? I can count the number on one hand, and even then that was just part of those systems. In my experience most software does not meet the above definition of quality, especially if it is highly profitable. Why is that? Surely companies want quality, as that allows them to change? Well, the systemic effects of exploitation over exploration explain this quality gap.

The Quality Gap

You see, design is difficult. You are trying to make something that solves a problem with limited knowledge of what the "best" solution to said problem looks like. Any profitable software is meeting a unique business need, and that means it's never existed before. If you know the best solution then you or someone else has already built it and there's limited value in rebuilding what already exists.

The result is that our initial system is going to be poor quality. It will work, sure, but it won't have incorporated all the learnings we made while building it. It's at this point that we have a choice, how much do we exploit what we have or explore new designs that could serve us better in future?

[Figure: Quality Software Chart]

Given it's a spectrum we have many choices after the initial MVP. The most extreme of which are pure exploitation or pure exploration.

Pure exploitation is a sprint focused on nothing but adding behaviour; no thought is given to long term growth. The initial cost of adding functionality is relatively low, which will quickly bring value to customers. However as time goes on the existing code starts getting in the way more and more, tech debt grows and the cost of adding new behaviour rapidly increases. Eventually the cost of a change starts to outweigh the benefit as bugs become endemic, preventing growth and causing the loss of existing customers, which leads to the company imploding.

Pure exploration is the complete opposite: it's a marathon with no destination. As soon as version one is released all energy is focussed on exploring new designs. Most of these designs will bear no fruit and will be thrown away. What we do have will be over-engineered and will get in the way of swift change. A machine that can do anything is hard to operate, after all. This, coupled with initial slowness and a lack of focus on direct customer value, will result in a company that is rapidly running out of runway. Sure, we might be lucky and discover something insanely profitable, but even then someone else with a more exploit focussed mindset could swoop in and take it.

Balanced is the platonic ideal, with a focus on stability and long term growth achieved by iterating in a loop between exploit and explore. We exploit the current design of the system to bring customer value, then explore new designs that incorporate those changes, removing debt shortly after adding it. We never let the system get into a state where we're unable to change it and we never get lost in designs that bring little to no value. This gives us maximum agility and allows us to meet growing customer needs with limited staff overhead. This is the ideal, and I have yet to see a company like this.

Finally we have major exploitation, minor exploration. Our goal is profit, so we won't have time to incorporate all our learnings by redesigning or refactoring the system. Instead we bolt changes onto the initial design, iteratively adding more and more behaviour that was never envisioned or enabled by that design. Once we notice that exploitation is slowing down we invest the bare minimum in exploration and stop as soon as we think we can start exploiting the system again. This is why the corresponding line in the chart above doesn't fully plateau: it's still going up, but very slowly.

This is the status quo of most professional software. The short term focus on exploitation, combined with the delay in feedback, leads to systems slowly getting worse over time. People are rewarded for going quickly, which incentivises shortcuts and adds more technical debt. It becomes a positive (self-reinforcing) feedback loop, making things worse over time.

In a company such as this you'll often see the following behaviours manifest, all of them the direct result of over exploiting the existing codebase:

  1. Quick fixes are the norm
  2. Fire fighting is common and fire fighters are rewarded
  3. Constant growth of dev team (need more devs to change the system)
  4. Attempts to improve things actually make things worse (rewrites fail, old and new systems live side by side)
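The four strategies described earlier (pure exploitation, pure exploration, balanced, and major exploitation with minor exploration) can be played out in a toy model. To be clear, this is a sketch built on assumptions I've invented for illustration, not measured data: every sprint has one unit of effort, features get slower to ship as debt grows, each shipped feature adds a little debt, and exploration effort pays debt back down. The function name and all the rates are hypothetical. It also ignores the over-engineering cost of pure exploration.

```python
def features_delivered(explore_share, sprints, debt_per_feature=0.1, cleanup_rate=0.5):
    """Toy model: total features shipped after `sprints` sprints of 1.0 effort each."""
    debt = 0.0
    shipped = 0.0
    for _ in range(sprints):
        # Exploit: feature work gets slower as debt piles up.
        delivered = (1.0 - explore_share) / (1.0 + debt)
        shipped += delivered
        debt += debt_per_feature * delivered
        # Explore: redesign work pays some of that debt back down (never below zero).
        debt = max(0.0, debt - cleanup_rate * explore_share)
    return shipped

# Hypothetical share of each sprint spent exploring, one entry per strategy.
strategies = {
    "pure exploitation": 0.0,
    "major exploit, minor explore": 0.05,
    "balanced": 0.3,
    "pure exploration": 1.0,
}
for name, share in strategies.items():
    early = features_delivered(share, 5)
    late = features_delivered(share, 40)
    print(f"{name}: {early:.1f} features after 5 sprints, {late:.1f} after 40")
```

Under these made-up rates pure exploitation is ahead in the first few sprints and then falls behind the balanced approach as debt compounds, which is the delayed feedback that makes shortcuts look like the right call at the time.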

Bridging the Gap

Given we're following a profit centric ethos, what can we do to improve the quality of our system so that it meets our changing needs? How do we bridge the gap? It takes a serious effort to fix things in these cultures, but while the scale is tipped towards exploitation, these efforts will be deprioritised until all avenues of easy exploitation have been exhausted. At this stage a large scale refactor/rewrite is the only way forward, and since speed/exploitation has been rewarded, no-one in your dev org will actually have the skills to pull this off. After all, why would they have these skills if there was never an incentive to explore and gain them?

This is a huge topic and not one I'm able to confidently talk about, so if you wanted a simple answer, sorry, you'll have to look elsewhere. If I were to offer a solution, it's that we have to change our ethos and accept that a certain amount of exploration is required. We have to find the balance. Leadership have to be on board with this and take a longer term outlook and vision. Reward mechanisms must be shifted to incentivise exploration and quality. Otherwise things can and will get worse over time.

System effects

The above is a bit simplistic and glosses over the details of how the system manifests this behaviour. To really understand the flows and feedback loops in software development, to explain the behaviour, we have to dive deeper. This is clearly a huge topic and we're near the end of the article, so you've probably guessed where this is going . . . that's right, I'm going to write a followup article to this piece that will explore the systems of software development in a profit centric org!

In this followup I plan to explore:

  • What we do when we can't exploit anymore (e.g. hire more developers)
  • Recurring anti-patterns in software development
  • Levers on quality
  • Why refactoring and rewrites seem to fail

With that deeper understanding we can look at leverage points and how we can improve things.

If this sounds interesting to you, leave a comment below.
