No, I think you're not appreciating the major difference between a fragment and a pixel, which is Microsoft/Nvidia's fault for not using the original RenderMan/3DLabs/OpenGL terminology, fragment shaders, and instead confusingly and idiotically calling them pixel shaders.
Yes, I'm mostly using Nvidia/Microsoft terminology. Nvidia hardware and DirectX are the de facto standards on PC, not OpenGL.
Even Unreal Engine uses "pixel shader" as its terminology.
Shading just one pixel still produces multiple fragments in most regular rendering scenarios, where perspective-projected, shaded, texture-mapped geometry is rendered into that pixel.
VRS is intended for the scenario where, at distance, more fragments are being calculated than minification allows the blended framebuffer pixel to display, meaning the extra shading is wasted processing.
The VRS artefacts from lowering the shading rate typically all come down to misjudging the direct or indirect sample rate (fragments per pixel) needed to avoid undersampling relative to non-VRS rendering.
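For concreteness, here's a minimal sketch of how a renderer might request a coarser shading rate with D3D12 Tier 1 VRS; the function name and the `cmdList` parameter are placeholders of mine, not from any real codebase:

```cpp
#include <d3d12.h>

// Hypothetical helper: record draws for distant geometry at a coarse rate.
void DrawDistantGeometryCoarse(ID3D12GraphicsCommandList5* cmdList)
{
    // Both combiners set to PASSTHROUGH so the base rate below is used as-is.
    D3D12_SHADING_RATE_COMBINER combiners[2] = {
        D3D12_SHADING_RATE_COMBINER_PASSTHROUGH,  // per-primitive rate
        D3D12_SHADING_RATE_COMBINER_PASSTHROUGH,  // shading-rate image
    };

    // 2x2 rate: one pixel shader invocation covers a 2x2 pixel block,
    // i.e. roughly a quarter of the shading work of the default 1x1 rate.
    cmdList->RSSetShadingRate(D3D12_SHADING_RATE_2X2, combiners);

    // ... record draw calls for geometry that minification already blurs ...

    // Restore full-rate shading for everything else.
    cmdList->RSSetShadingRate(D3D12_SHADING_RATE_1X1, combiners);
}
```

Only requesting 2x2 where the content is already minified is exactly the "don't undersample relative to non-VRS" rule above.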
Pixel shader overdraw always occurs; it's a consequence of us not being able to remove all hidden surfaces before the rendering pipeline starts.
It's also a consequence of AMD's and Nvidia's hardware rasterizers shading in 2x2 pixel quads, as the sketch below illustrates.
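To put a number on the quad effect, here's an illustrative back-of-envelope calculation; the triangle and its coverage figures are made up for the example, not measured:

```cpp
#include <cstdio>

int main()
{
    // Hypothetical thin triangle: it covers 100 pixels, but those pixels
    // touch 80 distinct 2x2 quads (long, narrow coverage is the worst case).
    const int coveredPixels = 100;
    const int touchedQuads  = 80;

    // The rasterizer launches all 4 lanes of every touched quad; lanes for
    // uncovered pixels run as "helper" invocations whose results are discarded.
    const int invocations = touchedQuads * 4;
    const int helperLanes = invocations - coveredPixels;

    printf("%d covered pixels -> %d invocations (%d wasted helper lanes)\n",
           coveredPixels, invocations, helperLanes);  // 100 -> 320 (220 wasted)
    return 0;
}
```

Helper lanes exist so the quad can compute screen-space derivatives for mipmapping, which is also why they can't simply be skipped.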
Yes, VRS and MSAA both have consequences for how these things work. That's just how the hardware behaves.
But what I'm saying, in simple terms, is that MSAA increases the number of samples per pixel while VRS reduces it, and both do so in a localized fashion. (Strictly, MSAA adds coverage/depth samples while usually still shading once per pixel, whereas VRS drops shading below once per pixel.) This means MSAA increases image quality but reduces performance, while VRS reduces image quality but increases performance.
And the result, with VRS, is having to shade fewer pixels, or rather fewer fragments; a back-of-envelope comparison follows.
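Here's the rough arithmetic at 1080p, assuming per-pixel (not per-sample) shading frequency for MSAA and a uniform 2x2 rate for VRS; both assumptions are mine, not from the discussion above:

```cpp
#include <cstdio>

int main()
{
    const double pixels = 1920.0 * 1080.0;

    // 4x MSAA: 4 coverage/depth samples per pixel, but typically still one
    // pixel shader invocation per covered pixel; the cost is mostly extra
    // bandwidth, resolve work, and extra invocations along triangle edges.
    const double msaaVisibilitySamples = pixels * 4.0;

    // Uniform 2x2 VRS: one pixel shader invocation per 2x2 block, i.e. a
    // quarter of an invocation per pixel.
    const double vrsInvocations = pixels / 4.0;

    printf("1080p: %.0f MSAA visibility samples vs %.0f VRS invocations\n",
           msaaVisibilitySamples, vrsInvocations);  // 8294400 vs 518400
    return 0;
}
```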