Can someone here walk me through how big O notation works? I understand that it's basically how long it would take the program to get through everything, but how would you do something like showing that n is O(n log n)?
Instead of "how long", think of it more as "how many iterations, as a function of the input size".
For example:
Question: How many iterations, as a function of the index you're searching for, does it take to find the 7th item in an array?
Answer: 1. You just return the 7th item.
Question: How many iterations, as a function of the index you're searching for, does it take to find the 7th item in a linked list?
Answer: 7. Start at the beginning and move to the next item until you do that 7 times.
Question: How many iterations, as a function of n, does it take to find the n'th item in a linked list?
Answer: n.
Note the difference here between questions 2 and 3. They translate into code as if you're writing these functions:
Code:
int find_7th_item(Node *head);
int find_nth_item(Node *head, int n);
The first one always takes 7 iterations; it doesn't matter what you do. That's called constant time and is always written O(1). The second one takes n iterations; the number of iterations changes depending on how you call it. In this case it's called linear time, and written O(n).
O(n) is worse than O(1) because with a big enough argument, your algorithm will begin to run slower.
What about more complicated examples?
Suppose you want to write this function:
Code:
int dump_binary_search_tree(Node *root, int levels_deep)
Assume the BST is complete, in the sense that every level is completely filled (every non-leaf node has both a left and a right child).
To go 0 levels deep: 1 iteration
To go 1 level deep: 3 iterations
To go 2 levels deep: 7 iterations
To go 3 levels deep: 15 iterations
So it's growing exponentially. The exact number of iterations is 2^(levels_deep + 1) - 1. You can draw this out on paper to prove it. If you want to go 7 levels deep, you will have to iterate over 255 nodes. So we say this has O(2^n) complexity.
On the other hand, n could be something else. If n is the number of nodes in the tree, then it's O(n) complexity (because if there are 255 nodes in the tree, you're going to iterate over 255 items).
What if, on the other hand, we want to know about adding an item to a binary search tree? An arbitrary number, we have no idea where it's going to end up.
In the best case scenario you've got a tree that is actually a straight line (imagine you inserted the numbers 2, 3, 4, 5, 6, 7 in that order; it would just be a straight line with each node having its "right" pointer set) and now you try to insert the value 1. Boom, only 1 place it can go: on the left of the root. O(1). On the flip side, the *worst* case is where you have that exact same tree, and you try to insert the number 8. Now that's O(n), because it's literally the last spot in the tree, found after you've examined every other item.
Best and worst case are fairly rare though; usually what people care about is average case. And in the average case (assuming you are working with a balanced tree), there will be about log_2(n) levels in the tree (remember back to the earlier example about how iterating over every item is 2^n in terms of # of levels: for any balanced tree, there are 2^levels nodes, and log_2(nodes) levels. Just basic discrete math).
So on average, you will have to look at about log_2(n) nodes before you find the place where the new item goes. So that's O(log n).
So what's O(n log n) mean then? Imagine you are trying to build a binary search tree from scratch. You've got a list of n numbers, and you want to make a binary search tree out of it.
Inserting 1st item: log(1)
Inserting 2nd item: log(2)
Inserting 3rd item: log(3)
Inserting nth item: log(n)
Inserting all the items = log(1) + log(2) + ... + log(n) = log(1*2*...*n) = log(n!).
Note this is actually less than n log n, because n log n = log(n^n), and n! < n^n.
Finding an item in a binary search tree is basically the same as inserting, because to insert you first have to find the spot, then you just move some pointers around. So finding is O(log n), as discussed above when I covered inserting.
Say on the other hand you've got people's data records in a binary search tree sorted by name, and you want to process them in a particular order (you already have the names in some external order). That requires n lookups, each of which is O(log n), hence giving you O(n log n).