Here's a related anecdote from the late 1990s. I was one of the two programers (along with Andy Gavin) who wrote Crash Bandicoot for the PlayStation 1.
RAM was still a major issue even then. The PS1 had 2MB of RAM, and we had to do crazy things to get the game to fit. We had levels with over 10MB of data in them, and this had to be paged in and out dynamically, without any "hitches"loading lags where the frame rate would drop below 30 Hz.
It mainly worked because Andy wrote an incredible paging system that would swap in and out 64K data pages as Crash traversed the level. This was a "full stack" tour de force, in that it ran the gamut from high-level memory management to opcode-level DMA coding. Andy even controlled the physical layout of bytes on the CD-ROM disk so thateven at 300KB/secthe PS1 could load the data for each piece of a given level by the time Crash ended up there.
I wrote the packer tool that took the resourcessounds, art, lisp control code for critters, etc.and packed them into 64K pages for Andy's system. (Incidentally, this problemproducing the ideal packing into fixed-sized pages of a set of arbitrarily-sized objectsis NP-complete, and therefore likely impossible to solve optimally in polynomiali.e., reasonabletime.)
Some levels barely fit, and my packer used a variety of algorithms (first-fit, best-fit, etc.) to try to find the best packing, including a stochastic search akin to the gradient descent process used in Simulated annealing. Basically, I had a whole bunch of different packing strategies, and would try them all and use the best result.
The problem with using a random guided search like that, though, is that you never know if you're going to get the same result again. Some Crash levels fit into the maximum allowed number of pages (I think it was 21) only by virtue of the stochastic packer "getting lucky". This meant that once you had the level packed, you might change the code for a turtle and never be able to find a 21-page packing again. There were times when one of the artists would want to change something, and it would blow out the page count, and we'd have to change other stuff semi-randomly until the packer again found a packing that worked. Try explaining this to a crabby artist at 3 in the morning.
By far the best part in retrospectand the worst part at the timewas getting the core C/assembly code to fit. We were literally days away from the drop-dead date for the "gold master"our last chance to make the holiday season before we lost the entire yearand we were randomly permuting C code into semantically identical but syntactically different manifestations to get the compiler to produce code that was 200, 125, 50, then 8 bytes smaller. Permuting as in, "for (i=0; i < x; i++)"what happens if we rewrite that as a while loop using a variable we already used above for something else? This was after we'd already exhausted the usual tricks of, e.g., stuffing data into the lower two bits of pointers (which only works because all addresses on the R3000 were 4-byte aligned).
Ultimately Crash fit into the PS1's memory with 4 bytes to spare. Yes, 4 bytes out of 2097152. Good times.