The nice thing about compressing normals is that each component is quantized in its own 0-1 range, so it's orthogonal to the dimensions of the scene, topology etc. So this case really isn't any more demanding than other uses of normal maps to date.
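To make that concrete, here's a minimal sketch of per-component quantization into a 0-1 range. The 8-bit depth and the exact mapping are my assumptions for illustration, not anything stated in the talk:

```cpp
#include <cstdint>
#include <algorithm>
#include <cmath>

// Map one normal component from [-1, 1] into its own [0, 1] range,
// then store it in 8 bits. Independent of scene scale or topology.
std::uint8_t quantizeComponent(float c) {
    float t = c * 0.5f + 0.5f;                      // [-1,1] -> [0,1]
    return static_cast<std::uint8_t>(
        std::round(std::clamp(t, 0.0f, 1.0f) * 255.0f));
}

// Inverse mapping back to [-1, 1] at load time.
float dequantizeComponent(std::uint8_t q) {
    return (q / 255.0f) * 2.0f - 1.0f;
}
```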
...
Can you clarify what you mean here? What does the 5ms budget have to do with acceleration structure choice, or with it being fixed or not?
I'll reply to the easiest ones first, then do another reply.
The key word in the performance metric they gave for Nanite (IMHO) was "rasterized", implying the entire process bypasses the vertex and geometry shaders and just works in the fragment shader, the way SDF solutions seem to. So IMO the BVH cost would also have been listed if it were burdening that hardware, especially as we know the character in the scene isn't rendered as part of the 90% of scene geometry (they list this in a slide) that Nanite handles. The character is rendered the traditional way with polygon primitives and lit using hardware-accelerated RT (BVH) to try and blend with Lumen's scene lighting, so contention for BVH resources by Nanite would surely be an important performance consideration IMO - if it does use a BVH with the fragment shader.
As for the normal map, I think we are talking at cross purposes. I understood it to be just a storage-size comparison, not an actual normal map.
So (IMO) for Nanite triangles with a UV channel: 1M Nanite triangles (verts? locations?) plus the corresponding UV channel - whatever Nanite's format is - occupies the same storage as 4K x 4K x 4 bytes = 64MB (quick arithmetic sketch below).
/edit
Actually, thinking about the wording, the 1M triangles with a UV channel are artist numbers from Studio Max etc., not Nanite's. So that 1M asset using UV/bump mapping - which maybe looks closer to 5M - is, I assume, what they are saying takes up a 4K normal map's worth of storage when encoded into Nanite's geometry, though without any more detail on the format Nanite encodes that data in. However, given that they say they are only using normal maps (not bump maps) on the warrior (shield scratches) and the rest is geometry, it seems reasonable to assume Nanite encodes the lower-frequency detail from the UV channel, which would otherwise perturb the 1M polys, directly as Nanite triangles.
edit/
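For what it's worth, here's the back-of-envelope arithmetic on that storage equivalence - just my numbers working from the quoted 4K figure and 1M-triangle count, nothing from Epic:

```cpp
#include <cstdio>

int main() {
    const double textureBytes = 4096.0 * 4096.0 * 4.0;  // 4K x 4K, 4 bytes/texel
    const double triangles    = 1000000.0;              // the quoted 1M-triangle asset

    std::printf("4K map footprint: %.0f bytes (%.0f MiB)\n",
                textureBytes, textureBytes / (1024.0 * 1024.0));
    // ~67 bytes per triangle for position + UV + connectivity, assuming
    // (purely my assumption) no additional compression on top.
    std::printf("budget per triangle: ~%.0f bytes\n", textureBytes / triangles);
    return 0;
}
```

That ~67 bytes/triangle budget seems at least plausible for quantized positions plus a UV channel, which is why the comparison doesn't strike me as outlandish.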
If you are saying that the inaccuracy tolerated by lossy normal maps - textured onto primitives of uncompressed verts - is the same inaccuracy primitive verts could handle, while still accurately generating shadow maps and undergoing shadow-sampler comparison lookups to produce soft shadows like the graveyard scene, then you are probably going to need to provide me a link to an example where that's being achieved. AFAIK verts for scene geometry don't like lower precision, and especially not the biased lossy compression that 3Dc effectively produces. For normals - rather than verts - it is easy enough, as they aren't critical - e.g. scratches on a warrior's shield - and can be renormalized to counter-correct, because they are unit length anyway.
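To illustrate the renormalization point: a 3Dc/BC5-style codec stores only X and Y, and the decoder reconstructs Z and renormalizes, pulling the vector back onto the unit sphere - a correction you can't apply to a lossy vertex position. A minimal sketch (illustrative only; real BC5 block decoding is more involved, and the flat 8-bit channels here are my simplification):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Rebuild a tangent-space normal from two quantized channels.
Vec3 decodeTwoChannelNormal(unsigned char qx, unsigned char qy) {
    float x = (qx / 255.0f) * 2.0f - 1.0f;          // [0,1] -> [-1,1]
    float y = (qy / 255.0f) * 2.0f - 1.0f;
    float zsq = 1.0f - x * x - y * y;               // assumes z >= 0 (tangent space)
    float z = zsq > 0.0f ? std::sqrt(zsq) : 0.0f;
    float len = std::sqrt(x * x + y * y + z * z);   // renormalize to counter-correct
    return { x / len, y / len, z / len };           // the quantization bias
}
```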