Laws of Software Development
by Jon Foster
I was one of the rare kids who grew up with a computer in my bedroom, starting in the mid 1970s. Today someone reading this would likely want to argue that almost all kids have computers. But back then it was rare. IBM did not release their version of a PC until 1981, and computing hardware cost thousands of dollars. Think: do I buy a car or a computer? I still consider myself blessed for having grown up along with the personal computing industry. I've had a passion for writing software ever since.
In the intervening years I have discovered several inescapable realities of software development. And although they seem obvious to me, it's also obvious that most of the current crop of developers aren't aware of them. So I will attempt to write down these observations and explanations on this page. Since many of these laws are second nature to me, I will likely be adding to this list as they percolate to the forefront of my brain.
- A program is never better than the platform it is on.
In short, our applications inherit the issues of the target platform. This includes hardware, the OS, and supplied libraries. I worked for an outfit that wrote a database application in FoxPro. On our test network and on clients' networks running Novell NetWare, we never had trouble with the application: no freezing, no lost or corrupted data. We had one client running a Windows server and network, and they were getting random occurrences of all of those issues. Were we dealing with a hardware or a software issue? Our outfit wasn't using a Windows server, but if Windows shared files correctly (as advertised) then it should have worked; nothing NetWare-specific was going on. Needless to say, this complicated our relationship with that client.
Well, a couple of years later I had left that outfit and was working as the head of IT and software development at another firm. We were using a canned order-entry application to process orders for our products. It had been written in assembly language, but like the FoxPro app above we were getting random freezes. And this situation was worse: once one workstation locked up, it would invariably cascade and lock up all 50 of the other workstations! Imagine 50 operators taking over 6,000 orders a day having their computers freeze all at the same time! The pressure was on. But I knew not to call the app's tech support. Would your customers?
Turns out the issue was bugs in both the Microsoft network client and the Windows server. The NetWare client and server didn't have the issue. The problem was that a Windows server and the associated client software can't correctly handle a shared-file database system like FoxPro: the auto-disconnect feature and the lock handling are broken. Interestingly enough, I tested Server 2008 a while back and it still had the issues. But then, why fix it when you make more money selling SQL Server?
There are other, more direct ways the platform affects an app. I'm sure any seasoned developer can recall a time or two when an API failed to function as described and they had to come up with a workaround. Some things you can work around; others, not so much. I try to hide the issues, but the platform still affects the outcome. It can't be any other way, since the platform is being asked to do work for us.
- A program is never better than the tools used to write it.
This is very similar to the above, but frequently of greater impact, and we have a greater ability to control it. Our applications inherit issues from the compilers, interpreters, run-time libraries, add-on libraries, and frameworks that we choose to use. As much as I'd love to dive into a few Microsoft examples here, I think I'll pick on an open-source project instead. I've watched the Lazarus project for over a decade now, and I have used it in the past, but I have given up on it due to the complete lack of attention to actually making things work.
As a first example, I wanted a simple applet to display a calendar for a month. I didn't want it to store any appointments or attach any kind of data to it. I just wanted a calendar display as simple as a wall calendar, with the ability to scroll through months and years. Much like using the *nix "cal" command to get a calendar in a terminal. It worked. But the resulting app was over 1 MB. With the functionality being provided by the GTK library, I was a little confused as to why my code was so big; I really expected 200 or 300 KB. Well... using the same compiler Lazarus uses, I wrote the appropriate GTK calls directly and compiled that: 40 KB! Now that's more like it.
But it has lots of bugs too. Any of my apps will randomly freeze when a dialog window closes because, for some reason, the framework hasn't realized it closed; the rest of the UI is locked until the app is killed. Or, from a different framework, random sections of a window get erased. And from the GTK library, a button that can't be clicked repeatedly unless you move the mouse between clicks. Think of the "next" button when browsing data, or in a search-and-replace box: click, wiggle mouse, click, ... It used to be worse: previously I'd have to move the mouse completely off the button and then back over it. I don't want these issues in my apps.
I've not even scratched the surface, but the bottom line is that since the development tools we use are handling our requests and instructions for us, our app is a direct product of those tools. Our apps will misbehave just as they do.
- Poor software quality is a security vulnerability.
OK, not every bug is a security vulnerability. But those who are careless with their coding practices and testing procedures are more likely to generate bugs, and a percentage of those will have security implications. In this day and age we have to pay much closer attention to detail than we did in the past, simply because there are many out there looking to leverage our mistakes.
- Interpreters are slow!!
YES! THEY ARE! (Yes, I was shouting, at all those who would like to argue the point.) They are slower because they are software that translates and runs other software. All you have to do is sit down to write an interpreter and it's painfully obvious that for every command in your programming language, many more have to be run to make it happen. One author of an interpreted language said that they run 100x to 1000x slower, meaning the task takes that much longer to accomplish. Nowadays it seems many would like to write interpreters in interpreters... that's just painful: 1000*1000x longer?!
When it comes right down to it, all the languages we use are human representations of what we want done, and they have to be converted to machine code, the native language of the CPU. Native-code compilers and assemblers do this once; the finished machine code is saved and executed whenever wanted. An interpreted language may get compiled (I prefer "tokenized") into "byte code", but then there is a program (more machine instructions) that reads the byte code, decides what should be done, calls the appropriate routines, ... over and over again. This can be fudged quite a lot by JIT compilation, which hopefully converts the byte code into machine code so that you get most of the benefit of native-code execution. But most interpreted languages aren't structured well for native execution, so there is usually additional work done to support their dynamic nature. And the JIT has to run at some point, usually on every launch... which takes time. Compile, compile, compile again...
- Programming languages are not created equal.
And the biggest differentiator is #4 above. If you're expecting or hoping to do something high-volume, it will save a lot in hardware costs and other headaches if it's developed in a language that compiles to native code. The 100x+ performance boost translates to fewer machines, cores, ... needed to run the software and perform the same workload. But there are many other things that can improve or hamper results. Something like Java's lack of pointers can be considered a security benefit by some, but it is a hindrance to many common tasks, like event handlers in GUIs. Although the syntax for anonymous classes hides the complexity from the developer, the CPU still has to do the work. And in the end you're still passing pointers; it's just a far more complicated way of doing it. The hope is that Java has fully vetted that pointer.
That leads us to garbage collection. There are pros and cons to this as well. It's considered more secure, since the developer is relieved of the duty of cleaning up program constructs, and it should eliminate whole classes of programming flaws, like use-after-free bugs. But performance suffers for the extra bookkeeping that has to be done, and it typically leads to memory leaks due to circular references. The more the garbage collector tries to get around those, the slower things get.
The worst case of garbage-collection tar I think I have ever seen was an early Microsoft BASIC compiler for the IBM PC (the original one). This really was the one piece of software I thought Microsoft could do well. I was wrong. A simple loop would be really fast for a few seconds, then it would get slower and slower, later moving on to extended stalls, all within a few minutes. The end result was that their interpreters ran better, at least after the first minute. I've seen similar effects in some Android apps and think to myself, "Yup... the trash men are busy again!"
This is a perfect example of rule #2. So, like most everything else, choosing the right tool for the job can make a big difference. Personally, I like to have one good general-purpose interpreted language and one native-code-compiled language available. The former is usually quicker for getting simple tasks done, and the source code is readable, making it good for management tasks where you want transparency for other admins. Then use the native-code-compiled language for performance-sensitive tasks like processing millions of web-log lines or determining the probability that an email is spam.
- Performance is a habit not just a future task.
It starts with the choice of development tools, frameworks, and external services, like a database. Then it continues with knowing the intricacies of your language and tools. Many times there are language constructs that can do the same job, but one may be faster than the other. Things like: "for" vs. "while"; "for" vs. "foreach" in PHP when iterating arrays; repeatedly accessing an object property vs. caching it in a local variable; x/2 vs. x>>1; ...
For every language there is a list of these constructs that perform similar tasks and that have speed/function trade-offs. Obviously the task at hand may dictate which construct to use, but when there's a choice, which do you choose? Those choices can make a significant impact on the final result. This is especially true of the object-property example. It's accessed like a simple variable, but does it come at a compute cost? If nothing else, the function call used to dispatch the value takes forever compared to accessing a register or a memory location.
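C has an exact analog of the "cache it in a local variable" example: calling strlen() in a loop condition re-scans the string on every iteration, turning linear work quadratic, while paying the cost once up front does the same job. A small sketch:

```c
#include <string.h>

/* strlen() is re-evaluated on every pass: O(n^2) for an O(n) job. */
int count_spaces_slow(const char *s) {
    int n = 0;
    for (size_t i = 0; i < strlen(s); i++)
        if (s[i] == ' ') n++;
    return n;
}

/* Same result; the length is computed once and cached. */
int count_spaces_fast(const char *s) {
    int n = 0;
    size_t len = strlen(s);
    for (size_t i = 0; i < len; i++)
        if (s[i] == ' ') n++;
    return n;
}
```

Both return the same answer; only the amount of work the computer is asked to do differs, and that difference grows with the input.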
I often think of writing fast code as "being lazy on the computer's behalf". The bottom line is that the less the computer has to do the faster it happens. So a developer needs to understand the work he's asking of his tools and plan a course that produces the least amount of compute effort to complete the desired task. And this goes not only for instruction cycles in the CPU but other resources as well. Disk is slower than RAM. How fast is that connection to a service? If you take the time to learn the compute cost up front, then as you write code it becomes second nature to choose the lazy way to get things done.
I have little need for a profiler on my own code, as I usually understand where the effort is getting expended. And that should be the way with most developers. I do agree there are times when doing something with the best performance possible may require much more effort. So go ahead, write the lesser code and plan to come back to soup it up later. There are also times when, due to the use case, the lesser effort may be all that's necessary. After all, today's hardware has a billion-plus instructions per second to burn. When I started, I only had a few hundred thousand.
OK... I just have to share a little story. I have a "string list" class in FreePascal. I originally wrote it eons ago in Delphi because I couldn't understand why their version was so slow. I even got fancy and put in a binary search mechanism for when the list was sorted. The trick was that the list had to be kept sorted. I figured this was good enough for many tasks. Years later I found that the binary search mechanism was really making things slow. This caused me great consternation. Was my binary search doing too much work even though it was finding the target?
I then tracked the slowness down to the list inserts. Strings in this language are basically stored as pointers to the reference-counted real data somewhere else, which typically doesn't move. To insert a string in sorted order I just move a slice of pointers down and drop the new one in. Yes, a pure binary-tree implementation would be faster. But moving pointers around should be relatively easy work, if there aren't a ton of them.
Long story short: after looking at the assembly language produced by the compiler, I found the flaw. For every string move in the array it was making a pair of function calls: increment reference count, decrement reference count. Honestly, I would have thought this would be done inline; I don't think it's a complex task. Relatively speaking, function calls are expensive in clock cycles. In this case, easily 100x more work than I thought was happening, and who knows what else it was doing. This extra effort is also multiplied by the number of strings being moved. Well... I'm not adding or subtracting any string references, just moving pointers around in a list. So I cast the strings to plain pointers while moving them. The result was a staggering increase in speed. In a tool I use a lot, this was a big win for me.
- Automated testing is a must.
I admit I've struggled a lot with this over the years. The tests have to be written and debugged too; it seems like double the work. But with the instability of today's software climate you will need to run those tests again, and it pays to have them ready to go! There are also cases where having a platform from which to test, without mucking with more complex and sensitive code, just makes life easier. Think: a simple snippet of code to encode and decode a piece of data is easier to manipulate and debug than the TLS stack in a web server. Especially if you don't want to break the web server.
Taking the time to prove code before it's put into a complex project can save substantial grief by preventing odd errors from showing up in the apps that rely on the unverified code. And when you stumble into that edge case where your code breaks... you add another test to your suite to debug the code and make sure it keeps working later. Then when the libraries and tools that you use get updated, just re-run the tests to see if anything broke! Those second, third, ... times are when you really appreciate having it automated, and where ginormous time savings occur. So unless it's a one-off case, and even then it's debatable, writing tests is a must! Now for a good method to test GUIs...
- Software does not decay.
I'm surprised at how many people think that software goes bad, like stuff left too long in the refrigerator: if it's too old it won't work any more, or it will develop more problems the older it gets, as if the software were decaying. Software is one of the few endeavors of mankind that is nearly eternal. As long as there is an appropriate platform to run it on and a viable storage medium, it will continue to do exactly the same thing it did when it was first produced. My Infocom software from the 1980s is still running on several devices. The real problem is that storage media fade, and the software can become corrupt and/or lost altogether. Technology also marches on: a suitable computer and OS may not be readily available to run the software on. But emulators can be written!
Still... software does not decay.
- Software does not need constant updating.
I think the myth that software must always be updated first germinated in people's minds because software vendors are always upgrading their software. But this is a necessity of the business model, not a function of the software itself. Business needs something new to keep people buying. If the software accomplishes a task, it will continue to do so indefinitely. I wrote a program for some guys and checked back on it over the years. After 5 years it was scrapped, because the hardware it was running on died and there were by then canned alternatives available that could do the same things with a few more bells and whistles.
That said, changes to the platform or discovered security issues may warrant and/or require changes to a program. But it amuses me when someone looks at an open-source project and figures it's not useful because it hasn't been updated in over a year. Perhaps the developer maintaining it has it doing all that (s)he wanted it to do?
- "Big" costs! In other words is bad.
This has to be considered relative to the task at hand. Some things are just big and there is nothing to be done about it but pay the cost. Big software and big data always have a cost. By big data I'm referring more to the way the data is stored than to how much of it there is, though obviously there are costs there too. It seems to me that throughout history mankind has had a wasteful mindset towards every resource discovered, until it becomes a scarce commodity. Computing resources are no exception.
Big software requires more RAM and disk space. It takes longer to load. It means fewer programs can run on the same machine at the same time. All that code has to be executed at some point, and that means execution times can suffer. Cache misses are more likely to occur. All that RAM has to be shuttled into and out of the CPU. Cache contention goes up. OS resource-management tables have to grow. If you need to scale up, more hardware has to be brought to bear. And with projects written in interpreted languages, all that code has to be parsed on a regular basis, even the parts that are never executed.
And then there's the development cost of maintaining it. We have finite minds; only so much can be adequately managed in them. It will take longer to find and fix things. It's more difficult to make changes cleanly, because things will fall out of mind. Bringing new talent on board becomes harder, since there is much more to learn.
Storing data in larger formats (XML, anyone?) can be a real performance killer. More disk is needed. More transfer between disk and RAM, RAM and CPU. Caches are less effective, since there is more to be cached. And the list goes on. It always amazed me how I could pull a single field, like a phone number, from all the records in a data file and compile it down to a smaller, focused list file, and the same kind of sequential scan would run several hundred times faster than it did on the original table. Less data takes less time. This is part of what makes indexes fast.
- Benchmarking on fast hardware is not benchmarking.
Are you benchmarking the hardware or the software? Fast hardware obscures poor software performance; performance figures that impress on lesser hardware are truly impressive. A while back I was amused when benchmark tests run on older hardware showed minutes of difference in something that only took a few minutes to run, but when moved to more modern hardware there was a barely discernible difference (fractional seconds). On fast hardware you really have to scale up your tests to determine whether things are actually improving. I did this recently with a mini-suite of language benchmarks I've used over the years: tens of thousands of iterations used to be enough; now hundreds of thousands or millions are required to distinguish the difference. And the truth is that those little differences will make themselves apparent as your application's load scales up.
- You will pay for not KISSing all things computing.
Growing up learning to program from my Dad, he taught me that "KISS" meant: Keep It Simple, Stupid. There are many variations of the acronym, from the harsh military versions down to the less offensive "Keep It Super Simple". I took it to mean that you're "stupid" if you don't keep it simple.
Over the years I have found that going the smart, fancy, complex, or otherwise less simple path is: prone to error, prone to breaking, more difficult to maintain, more difficult to understand later, and less likely to work across a greater array of scenarios. When I ran across the SysV script-based init system I was thrilled. How many times had I had to fight with smarter systems to get services launched? The simple, non-thinking approach of just launching a service is more likely to work in cases where things are broken. Over the years I've been amazed at how often this has proven true.
I think it was Einstein who is quoted as saying something to the effect of, "True genius is in simplicity." I firmly believe in all things KISS when it comes to computers. It saves a lot of heartburn later. And there are many non-computing applications where this is still true!
- If you DRY you won't stumble.
For those who don't know, this acronym means: Don't Repeat Yourself. This is pretty simple. The more times something is repeated, the more places you have to remember to fix it when things change. The more things to remember, the less likely you are to remember them all. When it comes to data, if a value is stored in more than one location then there is a good chance of a discrepancy. And now you have to violate KISS and come up with a mechanism for resolving those discrepancies.
- 1st and most important law of web development: You can NEVER trust the client (browser).
You say, "But I wrote the app running on the client." Did you? Sad truth of the matter is that the world is now filled with people who's sole purpose in life is to break into your internet-present service. The apps they are using aren't yours. Today the are so many tools available that make it easy to present data to a server. This is why I feel web development is a drudgery. Its twice the work and it violates DRY! Your checks and balances HAVE to be on the server. But often you want them on the client, to give the user that "not so much like a browser, but more like an app" feel. So then you put them in the client software too.
I recently demonstrated to the owners of an eCommerce site that, with all the pricing math in the browser and the server just taking the client's word for it, it was trivially easy to "name your own price". I placed a $1,000 order and paid $1. (It wouldn't accept $0.) Unfortunately, I've seen a lot of professionally written code with this flaw. If nothing else this could put your company's reputation in jeopardy; at worst it can bankrupt you.
- Version 1 is always sub-optimal.
I'm not referring specifically to a program released as a literal v1.0, although it applies. Basically, I'm referring to the first version of any piece of software written by a developer: subroutines, modules, scripts, applets, apps, ... The larger the piece of software, the more it applies. The first time is always a learning experience. It's possible that something is so simple it can be done as well as it can be done in the first writing. But as things grow in complexity, this certainly is not the case. The first writing is going to bring to light all those things you didn't think of. And if you look over the code after its initial writing, ways to simplify, optimize, ... will likely be apparent.
That old saying, "The devil is in the details," probably applies to software more than most things. Take the simple example of joining two file path components. How many times has it been written like "x = y + '/' + z"? Yet any slightly seasoned developer can point out the pitfalls: potential security problems, the inability to compare paths coming from different sources, ... After you've stumbled into a few of those situations it becomes apparent there are more details to be concerned with. Hence v1 wasn't the best it could be.
- Good error reporting is critical.
I know nobody wants to invest a ton of effort into error reporting. After all, it's the working function we are out to obtain. I'm regularly tempted to shortcut and report errors like "it didn't work", and when I do, I'm usually the first to suffer from it. Imagine a login process that can't read its user list, for some reason, so it reports "access denied". You bang your head for a while, repeatedly retrying your login. I'm sure all of us have chased a phantom problem or two due to vague or misleading error messages.
But the worst is probably the "silent failure". Where do you start troubleshooting those? How long does it take to even realize something is broken? Probably the most annoying offender on this front is Apple, with their Mail program. I don't use it, but I constantly get queries from users asking why they can't get their email. At the same time, they're getting locked out of the system, because Apple thought it wise not to tell the user there was a problem with their credentials but instead to retry rapidly and continuously with the bad credentials. I guess some designer thought that was friendlier. Like the credentials are going to magically fix themselves? In all fairness, I think they do put a teensy triangle next to the inbox. But reporting the login error and presenting the login dialog would be much more useful! Those devs need to be whipped with a wet noodle!