How software is made – a tour of the sausage factory

Bjørn Stærk

2010-08-01

Did your computer ever crash on you? That was my fault. Not literally, of course. You’re probably not a user of the software I’m working on. But it was the fault of someone very much like me – another programmer.

When it happened, you may have thought: This always happens. Why can’t they get it right? What kind of incompetent morons make software that doesn’t work? Well, it’s incompetent morons like myself, and now I’m going to explain how we do it. I want to explain it in a way that can be understood by non-programmers, or, as we programmers secretly refer to you: Those stupid users who crash our programs all the time. I want you to understand what software development is actually all about, what the challenges are, why it’s a difficult and even downright ugly process. Because it’s different from what you probably imagine.

What is code?

Software is made from code, which is a set of instructions to the computer. You know all that, and how it’s all 0’s and 1’s in the end. But the word “instruction” actually gives the wrong impression about how this works. If I give you an instruction to “go to the grocery store, buy some bread, and bring it back”, then you know what that means, and you’re able to do it. You may not want to do it, but all the words in that sentence make sense to you. You can’t give that kind of instruction to a computer. You have to explain what it means to “go” somewhere, what a “store” is and how to recognize the right one, how to buy something, and how to carry it back. In excruciating detail.

The reason the instruction “go to the store and buy some bread” makes sense to you is that, somewhere in your mind, there’s a detailed explanation of each of the concepts in that sentence: What it means to “go” somewhere, what a “store” is and how to recognize the right one, how to buy something, and how to carry it back. So when I tell you to go to the store, I’m actually telling you to access those descriptions in your mind. This speeds up communication. I don’t have to say “move your right foot, then your left, and then ..” I can just say: “You know how to walk. You know what a store is. Go forth and buy!”

So maybe we can do that with software as well? Maybe one programmer can write some code that explains what “walking” is, another what a “store” is, etc, and then they can package it and make it available for other programmers. So instead of saying “move your right foot, then your left, and then ..” I can just say “You know how to walk. You know what a store is. Go forth and buy!”

And that’s precisely what we programmers do. We call this a library (or a module, a component, a platform, etc, but let’s stick with library). Somebody makes a library that explains to the computer what walking is, and then other programmers use that library to give the computer high-level instructions like “Go over there”.
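For the curious, here is roughly what that looks like in Python. This is only a toy sketch, not a real library: the names `move_foot`, `walk_to` and `buy_bread` are all made up for illustration, and the "world" is just a number line where a position is a whole number.

```python
# A toy "walking library". Every name here is hypothetical.

def move_foot(position, direction):
    """Low level: one step moves us one unit left (-1) or right (+1)."""
    return position + direction


def walk_to(position, destination):
    """Higher level: keep stepping until we arrive. Returns the step count."""
    steps = 0
    while position != destination:
        direction = 1 if destination > position else -1
        position = move_foot(position, direction)
        steps += 1
    return steps


# Another programmer, using the library, only has to say where to go.
def buy_bread(home, store):
    steps_there = walk_to(home, store)
    steps_back = walk_to(store, home)
    return {"item": "bread", "steps_walked": steps_there + steps_back}


print(buy_bread(home=0, store=5))
# prints {'item': 'bread', 'steps_walked': 10}
```

The person who writes `buy_bread` never mentions feet. They trust that whoever wrote `walk_to` got the details right.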

So that solves the problem, right? Software isn’t difficult at all. We just need lots of good libraries, and then you can tell the computer what to do, with clear, high-level instructions. We may not even need any programmers any more, because, with the right libraries, anyone can give the instruction “go to the store, buy some bread, bring it back”.

Levels of abstraction

But software is not like the physical world. When humans deal with the physical world, we normally do so only on a few levels of abstraction. Let’s think of two levels: A higher level and a lower level. Each action on the higher level is made up of lots and lots of actions on the lower level. For instance, at the lower level, you “move your right foot”, and then your left foot, and then your right foot, and so on. At the higher level, you “walk to the store”. At the lower level, you “press one finger down a bit”, and then lift it back up again. At the higher level, you “type the letters ‘h’, ‘e’, ‘l’, ‘l’, ‘o’ on a keyboard”.

Beneath the low level, there are even lower levels of abstraction. To move your foot, your body has to do many small tasks, which again consist of smaller tasks. And above the high level, there are even higher levels. Buying food is part of your overall strategy of staying alive. Typing letters is part of your overall plan to write an e-mail.

So there are more levels than two, in the physical world. But not many more levels. It would be possible but hard to think of 10 or 20 levels of abstraction in ordinary human life, and you would not feel emotionally attached to the highest and lowest levels. Every step you take up or down takes you further away from the things that matter to you on a daily basis.

Software is different because there are many more than 10 levels of abstraction. There’s no way of telling how many levels there are, and we make more of them all the time. Making software is not just about phrasing instructions, it’s about finding the right level to phrase them on, and building the levels you need in order to get there. We do this all the time. When you think of code, don’t think of a list of instructions, think of a tower of abstractions, each level stranger than the one below it.

Remember the difference between “move your foot” and “walk to the store”. Moving up a level of abstraction enables you to say things that do not even make sense on the lower level. And software is not limited by physical reality. It is, as Frederick Brooks put it, pure thoughtstuff. Nothing is fixed, everything is malleable – and you can always add more levels if you feel like it.

As a programmer, you don’t just tell the computer what to do, you define the concepts that allow you to tell the computer what to do. You build concepts upon concepts – and if there’s a mistake somewhere, the whole thing could fall down.

Mountain climbing

This makes it sound like programming is difficult, and it is. It also makes it sound like programmers are extremely intelligent. We’re not. Some are, but if software could only be written by geniuses, there wouldn’t be much of it.

That’s not to say that anyone can be a programmer. But what you need is not the kind of raw intelligence you need to be a theoretical physicist, but the ability to move safely up and down these malleable, tricky, untrustworthy levels of abstraction.

Programming is not usually about understanding complex formulas and concepts. It is about reducing complex concepts and formulas to something that is simple enough for programmers to understand. Any trick is permitted. A rock climber does not climb a steep mountain with their bare hands and feet. They use equipment. Programmers do the same. But we take it one step further: We look at the mountain, and then check if there’s a road somewhere, so that we can drive to the top instead of climbing. Or maybe we could take a helicopter.

When an application crashes, the immediate cause is that a programmer said “go left” when they should have said “go right”. But the ultimate cause may be that a programmer was trying to climb a mountain when they could have used a car, or that they shouldn’t have been trying to get up that mountain in the first place.

They may not even know that there is such a thing as a car yet, because the world of software moves through the equivalent of an industrial revolution every couple of years. Stop paying attention for a while, and you’re left behind in a previous century. Maybe someone just invented a way to blow up the mountain. Maybe mountains are obsolete, it’s all about space elevators these days. And this poor programmer is stuck half-way up that cliff, trying to decide if they should go left or right.

That’s what makes software difficult to make.

Planning the unplannable

Software is rarely made by individuals. It’s made by organizations and groups, and, unless the programmers kindly donate their evenings to an open source project, they usually ask to be paid for making it. This causes all sorts of problems. Before they’re willing to pay me for writing code, my employer needs to have an idea of how to earn back that money later, at a profit. This means that I, and the 5 or 50 or 500 other programmers working on this code, need to have a plan for what to make, and when it will be done.

The many levels of abstraction in software do not just make it hard to write. They also make it hard to make plans for your software project. At the time when you start a new software project, it is actually not possible to know both what that software is going to be like in the end, and when it will be done.

It can’t be done. Not both at the same time. The reason is that every software project should do something new and experimental – otherwise it’s not worth doing. All the safe tasks in the project, the ones that are clearly understood, will most likely be solved by libraries.

Think of the first bridge that was ever made. Making the first bridge was a difficult, experimental task. The first attempt probably failed, and the second, and third. Then, gradually, engineers began to understand what it is that makes bridges stable. Today it’s understood very well. Engineers are still trying to push the boundaries for what a safe bridge can be, but the basic ideas have been hardened by millennia of experience.

In the physical world, you learn how to build bridges well, and then you keep building those same bridges. One at a time. All over the world. In the software world, you don’t build the 1 000 000th bridge. You pay the engineers of an existing bridge to get a copy of theirs, and then you move up an abstraction level to something more difficult, perhaps to assemble a million bridges into a weird superbridge structure that only makes sense in a computer program. Once solved, that problem too is packaged as a library, and programmers move up to the next frontier, the next level of abstraction.

This means that programmers do the equivalent of inventing the first bridge all the time. They have to, because if it had already been invented, and fully understood, it would be a waste of resources to reinvent it yourself.

Falling water

So think back to that first bridge ever made. Imagine you were the manager of the team that designed and built it. Before you can get started, you need to find some investors who are willing to finance the project. Perhaps you have a vision: “There’s some water in the way here, and we intend to make some kind of road that gets people from one side to the other.” But the investors want more than a vision. They want to know what they’ll get, and when they’ll get it. Will it be safe? Will it take people only, or horses as well? How many can cross it at a time? What happens if a boat comes along? What will it cost? Is it worth the cost? Will it be done in a year? Do you even know that it is possible to do this?

Would you be able to answer these questions, for the very first bridge in history? Would you be able to tell them what the bridge would be like, and when it would be done? Probably not. But you’d have to. Otherwise you won’t get the money you need. “A road over water?! Bah, nonsense!”

This is the conundrum at the heart of software development. It’s the root cause of all late projects, of all disappointed customers, and all software crashes. You can’t make plans, but you have to.

All improvement within the software industry follows two paths. The first is the one I described above: How to make ever more powerful libraries, at ever higher levels of abstraction. The second is how to plan the unplannable. This process is going on right now, and there is no end in sight. No place where we can rest and say, “yes, now we know how to make software”.

In the 2000s, the trend in project management has been to solve the planning problem by doing less of it. Earlier, it was thought that since it was so difficult to make plans for software projects, we would just have to try really really hard to make good plans. This was called the “waterfall” model. You first spent a lot of time making a detailed plan for the project, and then the water flowed down a level, to the programmers, who spent a lot of time implementing the plan, and then the water flowed down to the testers, who verified that the software was stable enough to use.

Waterfall projects tended to spiral out of control. What would happen was this: The planners would spend lots of time making a plan, but the plan didn’t take into account something important. So when the programmers made the software, the software was flawed. The testers discovered that the software was flawed, and pointed at the programmers: “The programmers gave us flawed software!” The programmers pointed to the planners: “They gave us a flawed plan!” And everyone was angry.

Eventually they got the software working, but when the next project came along, the testers pointed at the programmers and said, “You’d better spend some more time programming this time, to make sure the software is correct”, and the programmers pointed to the planners and said, “You’d better spend some more time planning this time, to make sure the plan is sufficient.”

So the next project took longer, because they spent more time planning, and more time programming – and when the software came to the testers, it was still flawed. And again, everyone said, “we’ll just have to spend even more time next time.” But there was no next time, because the customers were fed up with their late deliveries and buggy software, and the company went broke. Or maybe they didn’t, but their software had spent so much time in development that it had become outdated before it was even released.

Agility

Why did this happen? Because it’s not possible to plan ahead what the software will be like, and when it will be done. Spending lots of time on planning didn’t make it any easier, it just made the project more expensive and time-consuming. Waterfall was an illusion: We’ll pretend to know the future, and if we fail we’ll pretend even harder next time.

The new idea of the 2000s was agility. “Agile” is an umbrella for lots of ideas about software development, but what it’s essentially about is that instead of having three big phases – planning, programming, testing – you have lots of small phases that include all three at once.

The idea is that you set out in a direction, and you know roughly where you want to go, but not exactly how and when you’ll get there. The further along you get, the more you learn about the project, and eventually you’re able to make good predictions about what you’ll deliver, and when. Sometimes the “what” is given, so you adjust the “when”, or sometimes the “when” is given, so you adjust the “what”. But you never promise both the “what” and the “when” at the start of the project.

It’s like going on a vacation. Imagine you’re going to a country you’ve never been to before. A “waterfall” tourist will make a detailed plan for every hour of every single day of their trip: When to travel, and how, and what to do at each stage. “On day 3 there’s a bus leaving at 11:24, which will get us to city X in time for dinner at this restaurant the guide book recommends.” But the plane was delayed, the hotel was unexpectedly closed, the bus didn’t go on Saturdays, they got food poisoning, and for each part of the plan that failed, the rest of the plan became less and less realistic.

An “agile” tourist will say: “We’ll arrive on day 1, and leave on day 7, and before we leave we’d like to visit X and Y, but we’ll just have to see when we get there.” Instead of bringing a detailed plan, they bring flexibility. They expect to be surprised. They prepare by making sure bad surprises won’t affect them, and that they’ll be able to take advantage of the good surprises. Perhaps when you get there you realize that Y is a real dump, and you should actually go visit Z. An agile tourist can make those kinds of spontaneous decisions. A waterfall tourist can’t – not without another round of planning.

As a tourist, you can easily drop the “waterfall” plan and become “agile” when you realize that the plan is a bad plan. A software project typically can’t do that, because there are contracts and market expectations involved. You are required to be on that bus at 11:24. But you’re not, so now everyone is angry at you. And what that means in the end is that the user gets stuck with software that came too late and doesn’t work.

Of course, agile doesn’t solve all problems either. No investors or customers like to hear that you can’t promise both what they’ll get and when they’ll get it. So this approach only works when there is a lot of trust involved. The party who takes the risk needs to trust that the programmers know what they’re doing and that they mean well. That’s not always possible. Then you just have to guess, and really really hope that you’re not too far off target. That works too. Sometimes.

Are programmers engineers?

There are many such ideas under the agile umbrella. Another is that instead of waiting until the end of the project before you test the software, you teach the program to test itself, so you can run the tests continuously throughout the project. So when a programmer adds or fixes a bit of code, they also add code that tests that the code works.

This doesn’t make the software flawless, because it’s really difficult for software to test all of its own functionality, but it does prevent the most embarrassing mistakes.
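In case you wonder what “code that tests the code” looks like, here is a minimal sketch in Python. Both functions are invented for this example: `total_price` stands in for the real code, and `test_total_price` is the bit the programmer adds alongside it. Real projects usually run such tests with a dedicated tool, but the idea is the same.

```python
def total_price(prices):
    """The 'real' code: add up the prices in a shopping basket."""
    return sum(prices)


def test_total_price():
    """The code that tests the code. It fails loudly if an answer is wrong."""
    assert total_price([]) == 0          # an empty basket costs nothing
    assert total_price([20, 15]) == 35   # bread plus milk


# Run the test every time something changes, not just at the end.
test_total_price()
print("all tests passed")
```

If someone later breaks `total_price` by accident, the test stops with an error the next time it runs, long before a user gets to see the mistake.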

The journey doesn’t end with agile, any more than it ended with the many good ideas that came before it. Software development is an immature discipline. We’re not even sure what kind of discipline it is. Is it science, engineering or a craft?

Naive users expect us to be scientists. In old science fiction movies, computers represented the rational. Even when they failed, they failed because of their rationality. Think of the computer that blows up because it encounters a paradox, or the malevolent rationality of HAL.

In this view, computers represented the furthest reach of science, with all the promises and horrors that entailed.

Then we started actually using computers, and we realized that while computer hardware involves science, computer software doesn’t. Making a spreadsheet application or a game clearly isn’t anything at all like theoretical physics. But it could be a bit like engineering.

We even put it in our job titles: Software engineers. Think of an “engineer”. What do you see? Perhaps a serious-minded, cautious builder. Someone who knows that the building they erect will stand safely, because they have the methodology and training that allow them to know these things. You see someone who doesn’t guess, who doesn’t take dangerous chances.

Now think back to your latest software crash. Could engineers really have made that application? The houses we live and work in, the infrastructure we rely on, almost never fall apart unexpectedly. But the software we use does that every day. So the “engineering” label is at best a goal to strive for, at worst it’s fraud.

From apprentice to .. slightly more skilled apprentice

That leaves us with programming as a craft, a practical profession that relies on experience, skill and intuition. This fits better with experience. It implies that mastery of programming cannot be taught fully in school, only acquired through years of practice. It also implies that the best way to learn programming is to work with someone who is better than you, so you can assimilate their habits and intuitions.

It also implies that there are many bad programmers, who have the training and the job title, but don’t have the experience or skill to do their job well. And that if you have a project you need to complete quickly, you can’t just say, “get me 150 programmers, so we can get this done”. With programmers, quantity is less relevant than quality, because you can do more with 15 good programmers than with 150 bad ones.

Even the craft label is a bit of a fraud, because a craft requires a certain mindset that the software industry often doesn’t have. Managers actually do say things like, “get me 150 programmers, so we can get this done”. And then it fails, and they say, “well, get me 300 programmers next time”.

But at least programmers can realistically aspire to be craftspeople. And then, one day far in the future, we will perhaps have advanced enough as a profession that we can call ourselves engineers.

And then your computer won’t crash any more. I honestly don’t know if this is possible, or if it will happen some day after the Second Coming of Alan Turing. Until then, I would like to say, on behalf of all of us: Oops! I’m really sorry about that bug!
