Knowing absolutely nothing about how the thing works, here's my guess.
1) The television flashes when you "shoot" it, and that sends a signal to the light sensor in the barrel of the Zapper.
2) Different signals are sent out by different parts of the TV image - a "miss" signal for parts of the screen without targets and a "hit" signal for each target.
3) Since the Zapper's sensor can only "see" a small part of the light flash, it gets only one of the signals that are sent out. The Zapper relays back to the system the input it received.
4) The cartridge takes that data and acts accordingly.
EDIT: Ah, I was pretty much wrong -- according to the link, by counting the time between the "shot" and the redrawing of the screen, the gun can determine exactly where on the screen it is aimed.
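For anyone curious how a time delay could turn into a screen position, here is a rough sketch of the arithmetic. It is only my illustration, not something taken from the link: the timing constants are approximate NTSC-style values I am assuming, the function name `position_from_delay` is made up, and it pretends the beam sweeps each line evenly with no blanking intervals.

```python
# Illustrative only: all constants below are assumptions, not values from the link.
LINE_TIME_US = 63.5     # assumed time to draw one scanline, in microseconds
VISIBLE_LINES = 240     # assumed number of visible scanlines
PIXELS_PER_LINE = 256   # assumed horizontal resolution


def position_from_delay(delay_us):
    """Map the delay since the redraw started to an (x, y) screen position.

    Returns None if the delay falls after the last visible scanline.
    """
    if delay_us < 0:
        raise ValueError("delay cannot be negative")

    # Whole scanlines elapsed give the row; the leftover time gives the column.
    lines_done, time_into_line = divmod(delay_us, LINE_TIME_US)
    y = int(lines_done)
    if y >= VISIBLE_LINES:
        return None

    x = int(time_into_line / LINE_TIME_US * PIXELS_PER_LINE)
    return (x, y)
```

With these assumed numbers, a delay of 100 full scanlines plus half a line (6381.75 microseconds) lands at row 100, halfway across the screen. Real hardware would also have to account for blanking periods and sensor lag, which this sketch ignores.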