
Software optimization


Optimization of software makes it use fewer resources to do the same thing. This saves money, provided that the code does something useful.

Typically, programmers minimize the amount of memory, disk space, or time that a program uses.

Optimization is related to refactoring (which makes software do the same thing more readably and maintainably). In many cases, simplified code runs faster and takes less space.

Some other forms of optimization minimize the work of the software design team. For example, safety-critical software requires validation, and the validation can be most of the expense of development.

Often, optimization of one form wastes some other resource. The trade-off between conflicting advantages is usually made by estimating their costs.

Optimization can be performed at any step of the programming project, from design to runtime usage. Programmers can optimize the design, perhaps (most powerfully) by finding a better algorithm. They may use automatic optimization features of compilers. They may choose to write selected parts of the code in an assembler. They may select those parts by analyzing their code with tools like profilers and debuggers.
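
For example, on a Unix-like system a programmer might look for hot spots with the gprof profiler. The file names below are only illustrative, and the exact commands vary by toolchain:

  gcc -pg -O2 -o program program.c      # compile with profiling instrumentation
  ./program                             # run normally; this writes gmon.out
  gprof program gmon.out > report.txt   # produce a flat profile and call graph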

Some optimization techniques follow, each big enough to deserve its own Wikipedia article:

  • don't optimize if you can help it!
    • if code rarely runs, don't optimize it. For example, do not optimize error code.
    • If optimization can improve performance significantly, optimize.
    • If optimization can save a significant amount of battery power on a mobile target system, optimize.
    • If a piece of code has not been tested and benchmarked thoroughly, don't optimize. Donald E. Knuth popularized the saying (often attributed to C. A. R. Hoare) that "Premature optimization is the root of all evil."
    • Otherwise, don't.
  • profile to find critical parts of the software and then only optimize the parts that matter
    • Think about what you're doing. A famous counter-example to blindly optimizing whatever the profiler reports as hottest was discovered by a group at Burroughs. They optimized the most frequently executed piece of code in the operating system: the idle loop of the multitasker. (Though this may seem silly at first, modern operating systems use a similar optimization to reduce CPU power drain during idle moments.)
  • measure attempted optimizations to see if they actually improved matters
  • use a better algorithm or data structure to reduce time, space, or both
  • use a better compiler on your code, or supply different optimization flags to your compiler. However, better compilers can cost a significant amount of money.
  • rewrite code in a lower-level language (C or assembly language are common choices) to reduce both time and space
  • use packed bit-fields or packed bytes to reduce memory usage (see the packing sketch after this list)
  • Check alignment;
    • Multi-byte integers run substantially faster on some computers if they are aligned with the data bus (usually on addresses that are multiples of two, four or eight).
    • On some computers, memory banks are interleaved; accessing memory with a stride that is a multiple of the number of banks keeps hitting the same bank, slowing access to the speed of a single bank.
  • use data instead of code to reduce memory usage
  • use overlays or virtual memory to reduce memory usage
  • parallelize
  • use specialized fast-path code to handle common cases
  • normalize or denormalize data
  • reorganize data to reduce false sharing, improve packing, or improve locality of reference (by rearranging fields in a record or by moving some fields to other records)
  • perform lazy evaluation (only evaluate enough of a data structure to accomplish the needed task)
  • eliminate common (shared) subexpressions
  • precompute, memoize, cache, or use hints to reduce time (see the table-lookup sketch after this list)
  • Use speculation and prefetching
  • reduce overhead by batching
    • Unroll loops
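
As an illustration of the packing and alignment points above, here is a minimal C sketch; the structure and field names are invented, and the exact sizes are implementation-defined:

  #include <stdio.h>

  /* Naively ordered fields: the compiler inserts padding so the int is
     aligned, which inflates the structure. */
  struct padded {
      char flag;      /* 1 byte, then typically 3 bytes of padding */
      int  count;     /* 4 bytes */
      char kind;      /* 1 byte, then typically 3 bytes of padding */
  };

  /* Fields reordered and packed into bit-fields: the same information
     in less memory. */
  struct packed {
      int      count;
      unsigned flag : 1;      /* one bit instead of one byte */
      unsigned kind : 7;
  };

  int main(void)
  {
      /* On many 32-bit compilers this prints 12 and 8. */
      printf("%u %u\n", (unsigned) sizeof(struct padded),
                        (unsigned) sizeof(struct packed));
      return 0;
  }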
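
A similarly minimal C sketch of the precompute/cache item: counting the one-bits in a byte with a table built once, instead of a loop on every call (the function names are invented):

  /* Computes the answer from scratch on every call. */
  unsigned count_bits_slow(unsigned char byte)
  {
      unsigned bits = 0;
      while (byte) {
          bits += byte & 1u;
          byte >>= 1;
      }
      return bits;
  }

  /* Precomputed table: spends 256 bytes of memory once to save time on
     every later call. */
  static unsigned char bit_count_table[256];

  void init_bit_count_table(void)
  {
      int i;
      for (i = 0; i < 256; i++)
          bit_count_table[i] = (unsigned char) count_bits_slow((unsigned char) i);
  }

  unsigned count_bits_fast(unsigned char byte)
  {
      return bit_count_table[byte];    /* a single array lookup */
  }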

The compiler optimization page lists techniques employed by compilers. Some of these are too difficult for humans to employ directly, but sometimes a programmer passes the code through an optimizing compiler, and then either tweaks the generated assembly language code by hand or tweaks the source code to convince the compiler to generate better code.
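
A hedged example of tweaking source code to help the compiler: some compilers cannot prove that the call to strlen below is loop-invariant, so hoisting it by hand lets them generate better code (the function names are invented):

  #include <string.h>

  /* strlen(s) may be re-evaluated on every iteration. */
  void upcase_slow(char *s)
  {
      size_t i;
      for (i = 0; i < strlen(s); i++)
          if (s[i] >= 'a' && s[i] <= 'z')
              s[i] = (char) (s[i] - ('a' - 'A'));
  }

  /* The length is computed once; the loop body is unchanged. */
  void upcase_fast(char *s)
  {
      size_t i, len = strlen(s);
      for (i = 0; i < len; i++)
          if (s[i] >= 'a' && s[i] <= 'z')
              s[i] = (char) (s[i] - ('a' - 'A'));
  }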

General issues

Need for optimization

Programmers often begin optimizing all of their code at the design stage, or wherever they see fit afterwards.

While performance is crucial, especially on an embedded system, optimizing too early may harm a system in other ways, such as its portability or extensibility.

Optimizations may make the code more difficult to understand, and hence to maintain. Therefore, optimize only where needed. Often quoted are Michael A. Jackson's two rules of when to optimize:

  1. Don't do it.
  2. (Experts only) Don't do it yet.

Bottlenecks have the greatest influence on software performance, because the system runs at the speed of its slowest part. For example, when John Carmack was developing Quake, he noticed that gameplay was slow. Using a profiler, he discovered that the scripting language, QuakeC, was being interpreted too slowly. He rewrote QuakeC to compile just the code it needed. This solved the performance problems more thoroughly than if he had optimized the graphics engine (which would also have been more difficult).

General techniques

The general concept of optimization is taking one piece of code and re-writing it in a better (quicker, smaller, more stable) way.

Software consists of algorithms and data structures, and one can trade these off to improve performance. For example, rather than using the slow but simple bubble sort algorithm, a programmer may use merge sort, quicksort, or heapsort. Searching for a number takes less time with a binary search tree than with a linked list.
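
As a sketch of the same idea (using a sorted array with a binary search rather than a full tree implementation, to keep the example short), the binary search below makes O(log n) comparisons where the linear scan makes O(n):

  /* Linear scan: O(n) comparisons. */
  int linear_search(const int *a, int n, int key)
  {
      int i;
      for (i = 0; i < n; i++)
          if (a[i] == key)
              return i;
      return -1;
  }

  /* Binary search of a sorted array: O(log n) comparisons. */
  int binary_search(const int *a, int n, int key)
  {
      int lo = 0, hi = n - 1;
      while (lo <= hi) {
          int mid = lo + (hi - lo) / 2;
          if (a[mid] == key)
              return mid;
          else if (a[mid] < key)
              lo = mid + 1;
          else
              hi = mid - 1;
      }
      return -1;
  }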

However, there is a price for almost all of these choices. When a list is already sorted, a naive quicksort (one that always picks the first element as its pivot) is slower than a bubble sort that stops after a pass with no swaps. Merge sort is inefficient for small lists. Binary trees take more memory and demand special runtime maintenance (balancing).

There's another form of optimization, as well. Most programs rely heavily on operating systems. Sometimes one can reduce the resources used by the operating system.

One classic example is the "UxAxPxB problem", first encountered by IBM in its time-sharing systems. Basically, the number of system control blocks was limiting the number of operators who could take orders on a computer system. Since IBM's customers were losing orders, this was a serious business problem.

Past a certain number of users, the number of system control blocks began to increase sharply. Debugging determined that the number of blocks was a product: the number of users, times the number of application programs per user, times the number of processes per application, times the number of control blocks per process.
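
To illustrate the multiplication (with invented figures, not IBM's): 1,000 users × 3 applications per user × 2 processes per application × 4 control blocks per process is 24,000 control blocks, and growth in any one factor multiplies through all of the others.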

IBM eliminated this problem by inventing "middleware" to handle "transactions," the most common business task affected by the problem. The middleware opened the necessary files once and responded to simplified order-entry software through a stateless interface in memory.
