Yes, but how do the instructions get to the cache?
Think of it like this.
You're a CPU, and you're drinking beer (that is, executing instructions). You reach into the six-pack that's right next to you (L1 cache), but you're all out (a cache miss). So you get up and head to your fridge (L2 cache) to see if you have another six-pack you can bring back to your couch. If you don't, you hop in your car and head to the supermarket (RAM). While you're at the supermarket, you grab as much beer as will fit in your fridge (filling your L2 cache). Once you get home and load up the fridge, you take a six-pack back to your couch (filling your L1 cache). Then you start drinking again, and the cycle repeats.
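If it helps to see that lookup order written down, here's a rough toy model in C. Every name and number in it is made up for illustration; it's just the couch/fridge/supermarket order expressed as code, not how a real cache works:

```c
/* Toy model of the lookup order from the beer analogy. All names and
 * comments are stand-ins; a real cache is implemented in hardware. */
#include <stdbool.h>
#include <stdio.h>

static bool in_l1(unsigned addr) { (void)addr; return false; }  /* pretend lookup */
static bool in_l2(unsigned addr) { (void)addr; return false; }  /* pretend lookup */

static void fetch(unsigned addr)
{
    if (in_l1(addr)) {              /* beer on the couch: fastest case */
        puts("L1 hit");
        return;
    }
    if (in_l2(addr)) {              /* beer in the fridge: noticeably slower */
        puts("L2 hit, refill L1");
        return;
    }
    /* Miss everywhere: the long trip, then refill L2 and L1 on the way back. */
    puts("miss: trip to the supermarket (RAM), refill L2, then L1");
}

int main(void)
{
    fetch(0x1234);                  /* first access ends up being the long trip */
    return 0;
}
```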
Since L2 cache is typically only around 1-2 MB, you can see why higher bandwidth doesn't really matter in this scenario. What matters most is the latency involved in getting things from RAM. In my example, it doesn't matter whether you drive a mini-van or a semi-truck to pick up your beer from the store. You're limited by how much you can put in your fridge (L2 cache), and by how long the round trip takes.
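If you want the non-beer version of "latency is what hurts", here's a minimal sketch in C of a latency-bound access pattern, a pointer chase. The array size and the scramble constant are arbitrary; the point is just that every load depends on the one before it, so a bigger truck (more bandwidth) can't help:

```c
/* Minimal sketch of a latency-bound workload: a pointer chase. Each load
 * depends on the previous one, so the CPU pays the "round trip" every step.
 * The size and scramble constant are arbitrary, picked only so the data
 * won't fit in L1/L2. */
#include <stdio.h>
#include <stdlib.h>

#define N (1u << 20)                    /* ~8 MB of size_t's, bigger than a typical L2 */

int main(void)
{
    size_t *next = malloc(N * sizeof *next);
    if (!next)
        return 1;

    /* Fill with scrambled jump targets so successive loads land far apart. */
    for (size_t i = 0; i < N; i++)
        next[i] = (i * 7919u + 1u) % N;

    size_t p = 0;
    for (size_t step = 0; step < N; step++)
        p = next[p];                    /* serial, dependent loads: latency-bound */

    printf("%zu\n", p);                 /* use the result so it isn't optimized away */
    free(next);
    return 0;
}
```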
Now, a GPU, on the other hand, has a different architecture and a different set of problems. Since it's dealing with much larger data sets (textures can take up a lot of memory!), it wants to maximize the amount of data it can push per trip. Instead of using a mini-van to move the beer around like the CPU does (which is the most efficient option for the CPU's needs), the GPU would rather use a semi-truck, even if it takes a bit longer to get to its destination.
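And here's the contrasting, bandwidth-bound pattern as a sketch, again with an arbitrary buffer size standing in for a big texture. The loads are independent, so the hardware can keep lots of trucks on the road at once, and bytes per second is the number that matters:

```c
/* Bandwidth-bound counterpart: stream straight through a big buffer.
 * The loads don't depend on each other, so many can be in flight at once;
 * total bytes per second matters more than the latency of any single trip.
 * The 256 MB size is an arbitrary stand-in for a large texture. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define COUNT (32u * 1024u * 1024u)     /* 32M uint64_t's = 256 MB */

int main(void)
{
    uint64_t *buf = calloc(COUNT, sizeof *buf);
    if (!buf)
        return 1;

    uint64_t sum = 0;
    for (size_t i = 0; i < COUNT; i++)
        sum += buf[i];                  /* independent, sequential loads: bandwidth-bound */

    printf("%llu\n", (unsigned long long)sum);  /* keep the loop from being optimized away */
    free(buf);
    return 0;
}
```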