If you're watching a dark movie and then suddenly a 5,000-nit image fills the display, your eyes are going to burst.
Yeah, definitely. Fortunately we have pupils to deal with some of that, and in the videogame world they perform artificial pupil adaptation to assist with this.
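The usual approach (often called eye adaptation or auto-exposure) is to measure the average luminance of the rendered frame and ease the exposure toward a target over a few frames, so a sudden bright scene starts blinding and then settles. Here's a minimal sketch of the idea in Python rather than any particular engine's code; the constants and function names are illustrative, not a real API:

```python
import numpy as np

def adapt_exposure(frame_luminance: np.ndarray,
                   current_exposure: float,
                   target_grey: float = 0.18,
                   adaptation_rate: float = 0.05) -> float:
    """One step of artificial pupil adaptation (auto-exposure).

    frame_luminance: per-pixel linear luminance of the rendered frame.
    current_exposure: the exposure multiplier used last frame.
    Returns the exposure multiplier to use next frame.
    """
    # Geometric mean is the usual choice of "average" brightness,
    # since we perceive luminance roughly logarithmically.
    avg_luminance = np.exp(np.mean(np.log(frame_luminance + 1e-6)))

    # The exposure that would map the scene average to middle grey.
    target_exposure = target_grey / max(float(avg_luminance), 1e-6)

    # Ease toward the target instead of snapping, mimicking the moment
    # your pupils take to contract after a sudden bright frame.
    return current_exposure + (target_exposure - current_exposure) * adaptation_rate
```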
It's a big subject; if you go and look at the link in the OP, you can see more information about how current TVs perform.
So whilst they can reach 1,500 nits in a small area of the screen, for prolonged periods or across larger areas (or the full screen) the brightness drops to roughly a third of that level, presumably because the LEDs would melt or explode otherwise.
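That's the automatic brightness limiter at work. As a rough illustration of the behaviour, here's a sketch where the window-size/nits figures are placeholders loosely based on the numbers above, not measurements of any particular set:

```python
def sustained_peak_nits(window_percent: float) -> float:
    """Rough model of an automatic brightness limiter (ABL).

    Illustrative numbers only: ~1,500 nits in a small highlight window,
    falling to roughly a third of that when most of the panel is lit.
    """
    # (window size as % of screen, sustained peak in nits)
    curve = [(2, 1500), (10, 1200), (25, 800), (50, 600), (100, 500)]

    if window_percent <= curve[0][0]:
        return curve[0][1]
    # Linear interpolation between the sample points above.
    for (w0, n0), (w1, n1) in zip(curve, curve[1:]):
        if window_percent <= w1:
            t = (window_percent - w0) / (w1 - w0)
            return n0 + t * (n1 - n0)
    return curve[-1][1]
```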
I would presume that the same is the case for the bigger standards, which are all to do with peak brightness.
It's not about being able to simply light up the whole screen like some awful super tanning machine in your home, but about having a TV capable of producing that peak brightness in specific places in your image.
To take a photography example showing blown highlights (I know photography examples are somewhat contentious): here is a scene you would have no problem looking at in real life, but the areas highlighted in red fall outside the upper range of what the camera/display can represent.

Being able to illuminate these areas to better represent how they look in real life makes the image more realistic. And rather than controlling what the user sees, compensating by making other areas of the image darker, the user's eyes can adjust in a way that is natural.
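For what it's worth, flagging those blown areas is simple in code. Here's a minimal sketch (the threshold and function name are just illustrative) that paints near-clipped pixels red, in the same spirit as the photo example above:

```python
import numpy as np

def mark_clipped_highlights(image: np.ndarray, threshold: float = 0.99) -> np.ndarray:
    """Paint near-clipped pixels red, similar to the photo example above.

    image: float RGB array in [0, 1], where 1.0 is the maximum the
           camera/display can represent.
    """
    marked = image.copy()
    # A pixel counts as blown if every channel is at (or nearly at) the top
    # of the representable range, i.e. the detail in that area is lost.
    clipped = np.all(image >= threshold, axis=-1)
    marked[clipped] = [1.0, 0.0, 0.0]  # solid red overlay
    return marked
```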