For 3D titles, unless you planned for rollback-based netcode from the very start, you will likely have to choose between having fewer prediction frames to work with (sampling the skeleton for everything means sampling anims and transforms during rollbacks, which is a clear extra CPU tax), or spending a non-trivial amount of memory (baked skeletons per frame) to ensure you can generate as many prediction frames as possible without losing framerate.
Baking doesn't *have* to take that much memory. For starters, what's needed is the hitboxes, not the skeletons, so that's already not that much data. For SF4, count around a dozen hitboxes max per frame, and proper quantization will give you around twelve bytes per hitbox (and roughly the same for capsules, for the games that use them), so ~150 bytes per frame, which is pretty compact already. At 60fps sampling that's about 9,000 bytes per second, so one megabyte gives you 111 seconds of hitboxes, which should be enough for a full SF4 animation set, IIRC. Then realize that you don't actually have to sample your hitboxes at 60fps: you can perfectly well do it like 2D games and make do with only a handful of hitbox keyframes per move, and the memory requirement for hitboxes will be, well, the same as it is for 2D games.
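To make the numbers concrete, here's a rough sketch in Python. The layout is my own assumption, not what SF4 or any shipped game does: a box stored as center plus half-extents, each component quantized to a signed 16-bit integer, which lands on twelve bytes. `UNITS_PER_METER`, `pack_hitbox`, `seconds_per_megabyte` and `MoveHitboxes` are all hypothetical names made up for illustration.

```python
import struct
from bisect import bisect_right

# Assumed layout (not from any real game): center (x, y, z) and
# half-extents (hx, hy, hz), each a signed 16-bit int -> 6 * 2 = 12 bytes.
HITBOX_FORMAT = "<6h"
HITBOX_BYTES = struct.calcsize(HITBOX_FORMAT)
UNITS_PER_METER = 1024  # hypothetical scale: ~1 mm steps, roughly +/-32 m range


def pack_hitbox(center, half_extents):
    """Quantize six floats (in meters) into a 12-byte record."""
    values = [round(v * UNITS_PER_METER) for v in (*center, *half_extents)]
    return struct.pack(HITBOX_FORMAT, *values)


def unpack_hitbox(data):
    """Inverse of pack_hitbox: returns (center, half_extents) in meters."""
    values = [q / UNITS_PER_METER for q in struct.unpack(HITBOX_FORMAT, data)]
    return tuple(values[:3]), tuple(values[3:])


def seconds_per_megabyte(bytes_per_frame, fps=60, megabyte=1_000_000):
    """How many seconds of per-frame hitbox data fit in one megabyte."""
    return megabyte / (bytes_per_frame * fps)


class MoveHitboxes:
    """2D-style storage: a few hitbox keyframes per move, each one held
    until the next keyframe starts, instead of one sample per frame."""

    def __init__(self, keyframes):
        # keyframes: list of (start_frame, [packed hitboxes]), sorted by start_frame
        self.starts = [start for start, _ in keyframes]
        self.boxes = [boxes for _, boxes in keyframes]

    def at(self, frame):
        # Find the last keyframe whose start_frame is <= frame.
        i = bisect_right(self.starts, frame) - 1
        return self.boxes[i] if i >= 0 else []
```

With the ~150 bytes per frame figure, `seconds_per_megabyte(150)` comes out at a bit over 111 seconds (taking a megabyte as 10^6 bytes); a dozen boxes at exactly twelve bytes each is 144 bytes per frame, which gives slightly more. For the keyframe version, a rollback just calls `at(frame)` for every frame it resimulates, which is a lookup with no skeleton or animation sampling involved.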