C++ Notes: Algorithms: Big-Oh Notation
Purpose
It's useful to be able to estimate the cpu or memory resources an algorithm
requires. This "complexity analysis" attempts to characterize
the relationship between the size of the data and resource
usage with a simple formula. This can be useful if you are testing a program
with a small amount of data, but later production runs will be with
large data sets.
Notation
The "O" stands for "order of".
Dominance
Because big-oh notation is only concerned with what happens for very
large values of N, only the largest term in the formula is needed.
For example, the number of operations in some sorts is N2 - N.
For large values, N is insignificant compared to N2,
therefore O(N) is N2, and the N term is ignored. Of course,
for small values of N it may be important in practical cases, but big-oh
only concerns large values.
Best, worst, and average cases
Typical Orders
Often people characterize the complexity as
constant O(1),
logarithmic O(log N),
linear O(N),
polynomial O(Nk),
exponential O(2N).
Here is a table of some typical cases. This uses logarithms to base 2, but
these are simply proportional to logarithms in other base.
| n | O(1) | O(log N) | O(N) | O(N log N) | O(N2) | O(N3) |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 1 | 1 | 2 | 2 | 4 | 8 |
| 4 | 1 | 2 | 4 | 8 | 16 | 64 |
| 8 | 1 | 3 | 8 | 24 | 64 | 512 |
| 16 | 1 | 4 | 16 | 64 | 256 | 4,096 |
| 1024 | 1 | 10 | 1,024 | 10,240 | 1,048,576 | 1,073,741,824 |
Caution
Although complexity analysis can be very useful, it also can fail.
- Too hard to analyze. Many algorithms are very hard to analyze
mathematically.
- Average case unknown. Often the exact nature of the average
data is unknown, so analysis of this important case is impossible.
- Unknown constant. Walking and travelling at the speed of light
are both O(N) ways to gets somewhere, but one is rather faster than the
other. Big-oh analysis only tells you how it grows with the size of the
problem, not how efficient it is.
- No large data sets. If the program isn't going to handle large
amounts of data, it may not matter how inefficient the algorithm is.
A simple, clear algorithm may be more important.
Benchmarks
Big-oh notation can give some very good ideas about performance for
large amounts of data, but the only real way to know for sure is
to actually try it with large data sets.