What is the best way of learning Big O notation? We're learning it in our Data Structures course, but all of it seems very perplexing, largely because of the pacing of the course.
Big O notation itself is very simple. It's a way to express the time complexity of a function (it can also be used for memory complexity, but we usually use it for time complexity). So, what does that mean? Essentially, it's a way to determine how long an algorithm is going to take to execute. Let's say you've written a sorting function that sorts an array in ascending order. You execute it on an array that's 100 elements long and it finishes in 2 milliseconds. "Nice!", you think. "It works! And it's fast too! 2 ms! That's nothing!"
Then you use it on an array with 100000 elements in it. And then you wait. And you wait. And you wait. It's still not finished. The question is: is it worth waiting for? Maybe it's just about to finish. Maybe it's going to take an hour. Or a day. Or a year. How do we know?
We know by looking at the time complexity!
You're using Bubble Sort to sort the array. Bubble sort has a best case complexity of O(n) and a worst case complexity of O(n²), where n is the number of elements to sort.
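(If you want to see where those two numbers come from, here's a minimal sketch of a textbook bubble sort with an early-exit check; the function name and the use of std::vector are just for illustration. The early exit is what gives the O(n) best case on an already sorted array, while the two nested loops give the O(n²) worst case.)

#include <utility>
#include <vector>
using namespace std;

// A textbook bubble sort: repeatedly swaps adjacent elements that are
// out of order, bubbling the largest remaining element to the end.
void bubbleSort(vector<int>& a) {
    int n = a.size();
    for (int i = 0; i < n - 1; i++) {
        bool swapped = false;
        for (int j = 0; j < n - 1 - i; j++) {
            if (a[j] > a[j + 1]) {
                swap(a[j], a[j + 1]);
                swapped = true;
            }
        }
        // No swaps in a full pass means the array is already sorted,
        // so we stop early: this is where the O(n) best case comes from.
        if (!swapped) {
            break;
        }
    }
}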
If we assume the array we sorted is the worst possible one, we have O(n²). This means that the time the algorithm should take to execute is t = n²·x, where x is some constant that depends on how fast your computer can execute code. So, we know sorting 100 elements takes 2 ms. That means x = t / n² = 2 / 100² ms = 0.0002 ms.
So, how long will 100000 elements take? t = n²·x = 100000² * 0.0002 ms = 2,000,000 ms ≈ 33.3 minutes. Worst case scenario, it'll take us roughly half an hour.
We don't have time to wait that long, so we rewrite our algorithm to be a better one. This time it has a worst case complexity of O(n log n). How long will it take us now?
100000 * log₂(100000) * 0.0002 ms ≈ 332 ms ≈ 0.3 seconds. That's a bit better. And just by using a better algorithm. We would have to get a computer that was roughly 6,000 times as fast for the old algorithm to keep up. And if we wanted to sort even more elements, we'd need an even faster computer.
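If you want to double-check that arithmetic yourself, here's a small sketch that plugs the example's numbers into both formulas (x = 0.0002 ms and n = 100000 are just the values from above, and the log is taken base 2):

#include <cmath>
#include <iostream>
using namespace std;

int main() {
    double x = 0.0002;   // ms per basic step, measured from the 100-element run
    double n = 100000;   // number of elements we now want to sort

    double quadratic = n * n * x;            // O(n²) estimate, in ms
    double linearithmic = n * log2(n) * x;   // O(n log n) estimate, in ms

    cout << "O(n^2):     " << quadratic / 1000.0 / 60.0 << " minutes" << endl;  // about 33.3
    cout << "O(n log n): " << linearithmic << " ms" << endl;                    // about 332
    return 0;
}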
So algorithmic complexity is very important.
A better computer can't compensate for a poor algorithm.
Big O notation is very handy for calculating how long something is going to take.
But that's if the Big O notation of an algorithm is already given to us. How do we go about calculating it for an algorithm we've written ourselves? Well, that's when it starts to get a bit more difficult.
Let's have a look at a simple algorithm. This algorithm goes through all the numbers in an array. It then checks if any other number in the array matches that number and, if so, prints out that we've got a match.
Code:
for (int i = 0; i < n; i++) {
    // Compare element i against every other element in the array.
    for (int j = 0; j < n; j++) {
        if (i != j && array[i] == array[j]) {
            cout << "Element " << i << " has a match!" << endl;
            break;  // found a match, no need to keep scanning for this i
        }
    }
}
So, let's try to determine the time complexity for this algorithm. There are two things we are interested in: the best case scenario and the worst case scenario.
Let's start with the best case scenario. What's the best case scenario? Well, the best case scenario is the input that the algorithm will be able to handle the fastest. So, what's that?
Well, we've got two loops, one inside the other. The outer loop will always go from 0 to n-1, regardless of what the contents of the array are. For each iteration of the outer loop, it runs the inner loop. Now, what's the best case scenario for the inner loop?
Well, the inner loop quits as soon as it finds a match. So the best case scenario is if it immediately finds a match every time. When does that happen? It happens when every element of the array is the same. Then every element matches array[0], which is the first element the inner loop checks, so there's no need to check the rest of the array. So, in the best case scenario, the inner loop only checks 1 element.
Now we've got all we need to figure out the complexity of the algorithm. The outer loop contributes n, and the inner loop contributes 1. In total we get n*1 = n.
The best case scenario has complexity: O(n).
But what about the worst case scenario? When's that? Well, the outer loop still always goes from 0..n-1, so that's the same. But the inner loop could potentially have to go through all of the elements all the way to n-1 if there aren't any matches whatsoever. If not a single element matches another, the algorithm will have to go through all of them in the inner loop.
So the outer loop has n, the inner loop has n. In total we get n*n = n².
The worst case scenario has complexity: O(n²).
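If you want to convince yourself of both results, here's a sketch that reuses the same duplicate-checking loops, with a countComparisons helper added purely for this illustration. It counts the inner-loop checks for a best-case array (every element equal) and a worst-case array (every element distinct):

#include <iostream>
#include <vector>
using namespace std;

// Runs the duplicate-checking loops and returns how many times the
// inner loop's comparison was executed.
long long countComparisons(const vector<int>& array) {
    long long comparisons = 0;
    int n = array.size();
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            comparisons++;
            if (i != j && array[i] == array[j]) {
                break;
            }
        }
    }
    return comparisons;
}

int main() {
    int n = 1000;
    vector<int> best(n, 42);          // best case: every element is the same
    vector<int> worst(n);
    for (int i = 0; i < n; i++) {
        worst[i] = i;                 // worst case: no element matches any other
    }
    cout << "Best case:  " << countComparisons(best) << " comparisons" << endl;   // about n
    cout << "Worst case: " << countComparisons(worst) << " comparisons" << endl;  // n * n
    return 0;
}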
And that's really all there is to it. Of course, in many cases it can be a lot harder than this. Determining what's the best case scenario and what's the worst case scenario isn't always so easy. But the principle is the same.
There's one more thing to think about. In Big O notation, we only care about the "worst offender". If we calculate a complexity that's n² + n/2, all we care about is n², because for very large values of n the other terms essentially disappear in comparison with how large n² gets (for n = 1,000,000, n² is 10¹² while n/2 is only 500,000). So the Big O notation for n² + n/2 is O(n²).
We also don't care about constants: something like 5n just becomes O(n). The constant is insignificant when n is really large.
The holy grail of complexity is O(1), meaning the algorithm takes constant time. It takes the same amount of time regardless of how many elements we throw at it.
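A classic example is looking up an element by index in an array or a std::vector: the lookup costs the same whether the container holds ten elements or ten million.

#include <iostream>
#include <vector>
using namespace std;

int main() {
    vector<int> small_vec(10, 1);
    vector<int> huge_vec(10000000, 1);

    // Both lookups are O(1): the cost of indexing does not depend on
    // how many elements the vector contains.
    cout << small_vec[5] << endl;
    cout << huge_vec[5000000] << endl;
    return 0;
}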
Ok, so with all that knowledge we should, after a lot of practice, be able to work out the best and worst case scenarios for any algorithm. There's only one problem with that: our data is very seldom going to be the absolute worst case or the absolute best case. What we'd really be interested in is some sort of average case: how long it will typically take. Not worst case, not best case, but how long it takes for most cases. This measurement, the average case complexity, exists as well, but it is way harder to determine. Your professor won't expect you to be able to determine that.
Usually, we're not that interested in the best case scenario; what we want is to minimize the worst case and the average case.